On Tue, 18 Dec 2001 23:56:30 +1100, Russell Shaw <rjshaw@iprimus.com.au> wrote:

>Muzaffer Kal wrote:
>>
>> On Mon, 17 Dec 2001 10:31:33 -0500, "Pallek, Andrew [CAR:CN34:EXCH]"
>> <apallek@americasm01.nt.com> wrote:
>>
>> >If you just want to divide by 64, shift right by 6 places. The modulo is what was shifted
>> >out.
>>
>> what if the dividend is negative ?
>
>The shift_right() function in ieee.numeric_bit operates on
>signed numbers by maintaining the sign bits. For an unsigned
>number, zeros are shifted in.

Yes, one would need to at least use an arithmetic shift, but that was not my point. If the dividend is negative, a shift doesn't always work if what you want is division in the sense most people (and all adaptive filters) define it. Say we want to divide "n" by "d"; we can represent the operation by

n = d * q + r

where q is the quotient and r is the remainder. For this operation to be considered division at all we need |r| < |d|, but the sign of the remainder is a little bit tricky because with a negative dividend one can select it to be either positive or negative, iow:

-5 = 2 * (-2) + (-1)
or
-5 = 2 * (-3) + (+1)

For the second choice the remainder is never negative (for a positive divisor), regardless of the sign of the dividend. This is not the commonly accepted definition of division (no cpu has an integer division which works like this and adaptive filters hate it with a passion). The usual definition of the remainder includes

sign(r) = sign(n)

The problem is that a shift doesn't do this. You get the second option, which may or may not be what you want. It all depends on what your "definition" of division is.

Muzaffer Kal
http://www.dspia.com
DSP algorithm implementations for FPGA systems

Article: 37651
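The two decompositions of -5 in Muzaffer Kal's post can be checked side by side with a few lines of Python. This is only an illustrative sketch, not anything posted in the thread: the function names are made up, and Python is used because its `>>` on integers is an arithmetic shift. The third function shows one common way to get the round-toward-zero result out of a shifter (add 2**k - 1 to negative dividends first), which is an assumption about what a design might do rather than something proposed in the post.

```python
def shift_divide(n, k):
    """Divide n by 2**k with an arithmetic right shift.

    The quotient rounds toward minus infinity, so the remainder
    (the bits shifted out) is never negative.
    """
    q = n >> k
    r = n & ((1 << k) - 1)
    return q, r


def truncating_divide(n, d):
    """Divide rounding toward zero, so sign(r) = sign(n)."""
    q = abs(n) // abs(d)
    if (n < 0) != (d < 0):
        q = -q
    return q, n - d * q


def biased_shift_divide(n, k):
    """Shift-based divide by 2**k that rounds toward zero.

    Negative dividends get a bias of 2**k - 1 before the shift.
    """
    bias = (1 << k) - 1 if n < 0 else 0
    q = (n + bias) >> k
    return q, n - (q << k)


if __name__ == "__main__":
    print(shift_divide(-5, 1))         # (-3, 1)   -5 = 2 * (-3) + 1
    print(truncating_divide(-5, 2))    # (-2, -1)  -5 = 2 * (-2) + (-1)
    print(biased_shift_divide(-5, 1))  # (-2, -1)
```

For non-negative dividends all three agree, so the distinction only matters when the dividend can go negative.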
On Tue, 11 Dec 2001 11:51:57 +0100, "alco" <alco@cardiocontrol.com> wrote:
(SNIP)
>
> - Has anything changed recently in the JTAG interface for the xc9536 that
>might cause a microcontroller to fail programming the cpld.
(SNIP)
>select a version-2 9536 as a target device.
>
>Thanks,
>
>Alco Looye
>alco@cardiocontrol.com
>

We did the same thing, and had the same problem. The newer XC9536 silicon requires a longer flash erase time. The fix is to manually edit your SVF file to account for this. Near the top of the file you should see four lines like this (each following an SDR 27 line):

RUNTEST 1300000 TCK;

Change these to:

RUNTEST 3000000 TCK;

This increases the erase time from 1.3 seconds to 3.0 seconds. Regenerate your XSVF file, and you will be OK. Also, see Xilinx Answers Database record 4475. You will notice that the SVF file generator produces an erase time of 1.3 seconds, even though the maximum erase time specified by Xilinx is 2.6 seconds.

BTW, we cleaned up the Xilinx 8051 code to get rid of signed variables, unnecessarily long variables, and other inefficiencies. This halved the programming time. I told Xilinx that they should clean up their example code, and they basically told me to go away and leave them alone.

===================================
Greg Neff
VP Engineering
*Microsym* Computers Inc.
greg@guesswhichwordgoeshere.com

Article: 37652
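If the SVF file gets regenerated often, the hand edit Greg Neff describes could be scripted. The following is a rough Python sketch under stated assumptions: the function name `lengthen_erase` and the sample text are hypothetical, the SDR line in the sample is a placeholder rather than real programming data, and the only thing taken from the post is the pair of RUNTEST values.

```python
import re

OLD_TCK = 1300000   # erase wait emitted by the SVF generator, per the post
NEW_TCK = 3000000   # longer erase wait suggested in the post


def lengthen_erase(svf_text, old=OLD_TCK, new=NEW_TCK):
    """Replace every 'RUNTEST <old> TCK;' line with 'RUNTEST <new> TCK;'.

    Returns (patched_text, number_of_lines_changed) so the caller can
    confirm that the expected number of erase waits was found.
    """
    pattern = re.compile(
        r"^([ \t]*RUNTEST[ \t]+)%d([ \t]+TCK[ \t]*;)" % old,
        re.MULTILINE,
    )
    return pattern.subn(lambda m: m.group(1) + str(new) + m.group(2), svf_text)


# Hypothetical fragment for demonstration only; not taken from a real SVF file.
SAMPLE = """SDR 27 TDI (0000000);
RUNTEST 1300000 TCK;
SDR 27 TDI (0000000);
RUNTEST 1300000 TCK;
RUNTEST 1 TCK;
"""

if __name__ == "__main__":
    patched, count = lengthen_erase(SAMPLE)
    print(count)
    print(patched)
```

Checking the returned count against the four lines mentioned in the post is a cheap guard against patching the wrong file. The XSVF file still has to be regenerated afterwards with the usual tool.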
This is a friendly and helpful newsgroup, but let's make sure that it does not get abused. Lots of textbooks explain how to divide by a power of 2, where the remainder is, and how you sign-extend the MSB. Explaining that is not the purpose of this newsgroup. Let's use our "bandwidth" for more complex and perhaps controversial questions that are not explained in textbooks and data books. Peter Alfke, Xilinx ApplicationsArticle: 37653
Greg Neff wrote:
<snip>
> BTW, we cleaned up the Xilinx 8051 code to get rid of signed
> variables, unnecessarily long variables, and other inefficiencies.
> This halved the programming time. I told Xilinx that they should
> clean up their example code, and they basically told me to go away and
> leave them alone.

Interesting :-)

I'm sure there is somewhere you could post the cleaned up code...

What was the final 8051 Code / RAM footprint, after you did this ?

Did you look at run length compression, or just use a BIT file copy ?

-jg

Article: 37654
Maybe a more general explanation is in order:

It is inherently impossible to extend a bidirectional line across a conventional amplifier. To go across a chip boundary, you have to know the signal flow and activate the appropriate driver. The "wired AND Longline" cannot pass through the chip boundary, since the I/O contains an amplifier, and a conventional amplifier is always unidirectional.

Peter Alfke
================================
Falk Brunner wrote:

> "Wilco Vahrmeijer" <wilco@cardiocontrol.com> wrote in message
> news:9vnl4h$1j1i$1@news.versatel.net...
> > Hi all,
> >
> > We've got a problem with FPGA express (FPGAexpress 3.6.6613 (supplied with
> > Xilinx ISE 4.1)) and bidir pins with a Xilinx device:
> >
> > I've made two blocks and each block has control signals and one
> > bidirectional pin (tri-state buffered). On the upper layer, these two
> > signals are routed to the same output pin. (See attachments)
> >
> > The problem is a warning from FPGA express:
> > "FPGA-pmap-18 (1 Occurrence) Warning: The port type of port
> > '/TryOutBiDir-1/BiDirPin' is unknown. An output pad will be inserted"
> >
> > and FPGA express inserts an output buffer instead of a bidir buffer. Internally
> > the signal is bidirectional, to the outside it's unidirectional.
> >
> > I want a bidirectional output pin !! Can somebody help me??
>
> To have a bidirectional bus inside AND outside the FPGA you have to isolate
> them.
>
> entity TryOutBiDir is
>   port
>   (
>     BiDirPin: inout STD_LOGIC
>   );
> end TryOutBiDir;
>
> architecture TryOutBiDir_arch of TryOutBiDir is
>
>   component driver
>   port
>   (
>     Write2Readed: in STD_LOGIC;
>     highZ: in STD_LOGIC;
>     Readed: out STD_LOGIC;
>     DQ: inout STD_LOGIC
>   );
>   end component;
>
> begin
>
>   Driver_1 : Driver port map (Write2Readed1,HighZ1,Readed1,BiDirPin_int);
>   Driver_2 : Driver port map (Write2Readed2,HighZ2,Readed2,BiDirPin_int);
>
>   BiDirPin<=BidirPin_int when con='1' else 'Z';
>
> end TryOutBiDir_arch;
>
> This code is not complete, the signal declarations are missing. You also
> need to generate the con signal, which controls the Tristate driver of the
> IO Pin.
>
> --
> Regards
> Falk

Article: 37655
On Tue, 18 Dec 2001 10:29:01 -0800, Peter Alfke <peter.alfke@xilinx.com> wrote:

>This is a friendly and helpful newsgroup, but let's make sure that it does not
>get abused.
>Lots of textbooks explain how to divide by a power of 2, where the remainder is,
>and how you sign-extend the MSB. Explaining that is not the purpose of this
>newsgroup.

I don't think I have ever seen the charter of this group but I know what you mean. It is as if all the people reading are sitting in a circle and when one asks "how do I divide by two?" everybody starts chanting "shift, shift, shift". But if a question is asked I think it needs to be answered; in a correct and comprehensive manner.

Muzaffer Kal
http://www.dspia.com
DSP algorithm implementations for FPGA systems

Article: 37656
This came as a result of thinking about how to make more efficient barrel shifters. Most readers are familiar with fall through barrel shifters and how they're usually implemented (with columns of 2 to 1 muxes, each column dedicated to shifting the data by a different power of 2).

I got to thinking about how to use MUXF5s for barrel shifters. It's clear that if you could bring out the terms that feed the MUXF5 you could get three results out of a slice instead of just two. That would essentially give me three 2 to 1 muxes in one slice instead of two, and it would really improve barrel shifter packing (and maybe be good for random logic use).

The problem with doing it is that it's hard to get the output of the "F" LUT out of the slice. But it can be done by bringing it out the CARRY-OUT. You use a MUXCY, and apply '0' and '1' to the DI and CI inputs, and the "F" LUT output to the S input of the MUXCY. That programs the MUXCY to be a buffer of the "F" LUT, and you get the (otherwise hidden) "F" LUT output as the carry-out (which can easily route to the outside of the slice).

The only problem with that is that it's very hard to program the CI of the MUXCY to '1' and still use the MUXF5. In fact, since the BX input is going to have to be used by the 'S' input of the MUXF5, you have to use the carry input of the slice. And that puts the problem of generating a '1' into the next door neighbor to the slice, where you'll have to use a LUT to generate it, thereby wasting the LUT you were trying to save.

That was where my analysis ended, but the other night I realized that for a barrel shifter, I don't have to control the value of that carry-out at all times. I only need to control it when I'm actually going to select it in the next stage of logic. And since the selector for the next stage of logic will be the same selector as is coming in on the 'S' pin for the MUXF5, that suggests that there might be a solution.

In fact there is. You program the "DI" input of the MUXCY to '1', and connect both the MUXCY.CI input and the MUXF5.S input to the same BX input. That's a natural use for the carry input, but it's usually not done because normally arithmetic logic is not used at the same time as the MUXF5 pin. But you can do it.

The result is that when the MUXF5 selects the "F" LUT (i.e. when the BX input is '1'), the MUXF5 operates normally, but the Carry-out will be forced to '1'. But that's the condition under which you would normally ignore the carry-out anyway, if you were using the circuit as a barrel shifter (with positive shift amount).

On the other hand, when the SHIFT input is low, the "G" LUT is selected for the MUXF5, and the DI and CI inputs of the MUXCY end up as '1' and '0' respectively. That causes the CARRY-OUT to follow the complement of the "F" LUT output. But we all know how easy it is to invert logic in a Xilinx, so I just put an inverter on the CARRY-OUT and the logic takes care of getting rid of the inversion for me.

The router isn't too good at combining MUXCYs and MUXF5s, so to get this into a single slice I have to RLOC it. But the good news is that this works.

Generalizing, this means that I can get certain collections of 3 logic functions in a single slice. The general rule is:

The three logic functions are {F,G',F5}, and the 9 input variables are {F1,F2,F3,F4,G1,G2,G3,G4, and BX}

F <= LUT(F1,F2,F3,F4);

G' <= LUT(G1,G2,G3,G4) nand BX;

with BX select
F5 <=
G when '0',
F when others;

I thought this was cool. It allows a 16-bit wide 0 to 7 bit barrel shifter in just 36 LUTs, which is 12 fewer than the number needed to create a barrel shifter the usual way.

The logic for the above was implemented in schematics, but I could easily convert this to VHDL if anyone is interested.

I also figured out a way to program a column of slices to perform a vector of 3 to 1 muxes instead of just 2 to 1 muxes. This can be used to create very efficient barrel shifters where the shift amount is a power of 3. An example would be a barrel shifter that shifts between 0 and 8 bits. With the usual barrel shift technique, such a barrel shifter would require 4 stages, but using 3 to 1 muxes it requires only 2 stages. There are some fixed costs associated with computing the controls for the stages and (since it uses arithmetic functions) driving the CARRY-IN for each stage. (It's actually more complex than I'm implying here.) I'll post code for it if anyone is interested.

Carl

--
Posted from firewall.terabeam.com [216.137.15.2]
via Mailgate.ORG Server - http://www.Mailgate.ORG

Article: 37657
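For readers who want to check the behaviour described in Carl Brannen's post without opening a schematic, here is a small behavioural model in Python. It is a sketch, not the author's design: all function names are invented, the mux polarities are taken from the prose of the post (MUXCY passes CI when S is '1' and DI otherwise; MUXF5 passes F when BX is '1'), and the 3 to 1 stage is modelled as a plain logical left shift on a 16-bit word, which is an assumption since the post does not give those details.

```python
def muxcy(s, di, ci):
    """Carry mux: passes CI when S is 1, otherwise DI."""
    return ci if s else di


def muxf5(f, g, bx):
    """F5 mux as described in the post: F when BX is 1, G when BX is 0."""
    return f if bx else g


def slice_outputs(f, g, bx):
    """Model one slice wired as in the post (DI = '1', CI = BX, S = F LUT).

    Returns (F5, inverted carry-out). With BX = 0 the pair is (G, F),
    so the otherwise hidden F LUT value is visible outside the slice.
    """
    carry_out = muxcy(s=f, di=1, ci=bx)
    return muxf5(f, g, bx), 1 - carry_out


def barrel_shift_base3(word, amount, width=16):
    """Shift left by 0..8 using two stages of 3 to 1 muxes.

    Stage 1 selects a shift of 0, 1 or 2; stage 2 selects 0, 3 or 6.
    """
    if not 0 <= amount <= 8:
        raise ValueError("amount must be in 0..8")
    mask = (1 << width) - 1
    low, high = amount % 3, amount // 3
    stage1 = (word, word << 1, word << 2)[low]
    stage2 = (stage1, stage1 << 3, stage1 << 6)[high]
    return stage2 & mask


if __name__ == "__main__":
    for bx in (0, 1):
        for f in (0, 1):
            for g in (0, 1):
                print(bx, f, g, slice_outputs(f, g, bx))
    print(hex(barrel_shift_base3(0x0001, 7)))
```

The truth table printed by the first loop matches the two cases in the post: carry-out forced to '1' (inverted: 0) when BX is '1', and the complement of F (inverted: F) when BX is '0'.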
I don't know about you guys and gals and pals, but every time I do a design, without exception, I ALWAYS go into the Xilinx Design Manager Design --> Options --> Implementation Edit Options and select "Inputs and Outputs" for "Pack I/O Registers/Latches into IOBs for". I ALWAYS want my designs to use IOB flip flops if possible. It seems to me that the default "Off" is a waste of these flip flops. Does anyone here ever turn this off?

Simon Ramirez, Consultant
Synchronous Design, Inc.
Oviedo, FL USA

Article: 37658
Hello,

This could be very useful for optimizing floating point adders or subtracters, since they use large barrel shifters for normalization and denormalization. If you have some VHDL for this I'd be eager to use it to see how it affects area and speed !!

Steven

Carl Brannen wrote:

> This came as a result of thinking about how to make more efficient barrel
> shifters.
(snip)

Article: 37659
I agree that it looks very clever and interesting. But, just as an aside, floating point needs arithmetic shifters for normalization, not barrel shifters. Also, remember that Virtex-II has lots of multipliers, many of them begging to be used as "free" shifters ( multiply by a power of 2 )

Peter Alfke
====================================
Steven Derrien wrote:

> Hello,
>
> This could be very useful for optimizing floating point adders or subtracters,
> since they use large barrel shifters for normalization and denormalization.
> If you have some VHDL for this I'd be eager to use it to see how it affects
> area and speed !!
>
> Steven
>
> Carl Brannen wrote:
(snip)

Article: 37660
Spartan-IIE: I am urgently looking for a (board-level) schematic symbol (preferably ORCAD or VIEWLOGIC) for an XC2S100E-6FT256C Xilinx FPGA. Is anyone in a position to help on this? -Thanks in advance :-)Article: 37661
"S. Ramirez" <sramirez@cfl.rr.com> wrote in message news:zKMT7.137295$Ga5.21230731@typhoon.tampabay.rr.com... > I don't know about you guys and gals and pals, but everytime I do a > design, without exception, I ALWAYS go into the Xilinx Design Manager > Design --> Optiions --> Implementation Edit Options and select and select > "Inputs and Outputs" for Pack I/O Registers/Latches into IOBs for. I ALWAYS > want my designs to use IOB flip flops if possible. It seems to me that the > default "Off" is a waste of these flip flops. Does anyone here every turn > this off? > Simon Ramirez, Consultant > Synchronous Design, Inc. > Oviedo, FL USA Hi Simon, That's what you get for using Design Mangler...er...Manager ;-) AustinArticle: 37662
Why not just ask the magic eight ball? Never mind, it would probably tell him to try again.

Andy Peters wrote:

> You could simulate it, and find out for yourself if it is OK.
>
> OK?
>
> "chensw20hotmail.com" wrote:
> >
> > Now,i want to implement it by counter controlling.is it OK?
> >
> > /*counter[2:0] works if read enable.Data was be shifted by counter control*/
> > always @(posedge NA_Clock or negedge Rst )
> > begin
> > if(Rst)
> > NA_Count<=0;
> > else if(NA_Read_Enable) NA_Count<=NA_Count+1;
> > else NA_Count<=0;
> > end
> >
> > /*data read out from fifo were allocated is NA_Des_Data0.1.....7 dividually
> > NA_Data_Out[15:0] :fifo data out
> > NA_Des_Data[7:0] [15:0] */
> > always @(posedge NA_Clock or negedge Rst )
> > begin
> > if(Rst)
> > begin
> > NA_Des_Data0 <=16'b0;
> > NA_Des_Data1 <=16'b0;
> > NA_Des_Data2 <=16'b0;
> > NA_Des_Data3 <=16'b0;
> > NA_Des_Data4 <=16'b0;
> > NA_Des_Data5 <=16'b0;
> > NA_Des_Data6 <=16'b0;
> > NA_Des_Data7 <=16'b0;
> >
> > end
> > else
> > case(NA_Count)
> > 3'b000: NA_Des_Data0 <=NA_Data_Out;
> > 3'b001: NA_Des_Data1 <=NA_Data_Out;
> > 3'b010: NA_Des_Data2 <=NA_Data_Out;
> > 3'b011: NA_Des_Data3 <=NA_Data_Out;
> > 3'b100: NA_Des_Data4 <=NA_Data_Out;
> > 3'b101: NA_Des_Data5 <=NA_Data_Out;
> > 3'b110: NA_Des_Data6 <=NA_Data_Out;
> > 3'b111: NA_Des_Data7 <=NA_Data_Out;
> > default :
> > begin
> > NA_Des_Data0 <=16'b0;
> > NA_Des_Data1 <=16'b0;
> > NA_Des_Data2 <=16'b0;
> > NA_Des_Data3 <=16'b0;
> > NA_Des_Data4 <=16'b0;
> > NA_Des_Data5 <=16'b0;
> > NA_Des_Data6 <=16'b0;
> > NA_Des_Data7 <=16'b0;
> > end
> > endcase
> > end
> > is it OK?
> > Thanks

--
--Ray Andraka, P.E.
President, the Andraka Consulting Group, Inc.
401/884-7930 Fax 401/884-7950
email ray@andraka.com
http://www.andraka.com

"They that give up essential liberty to obtain a little
temporary safety deserve neither liberty nor safety."
-Benjamin Franklin, 1759

Article: 37663
On Wed, 19 Dec 2001 07:54:19 +1300, Jim Granville <jim.granville@designtools.co.nz> wrote:

>Greg Neff wrote:
><snip>
>> BTW, we cleaned up the Xilinx 8051 code to get rid of signed
>> variables, unnecessarily long variables, and other inefficiencies.
>> This halved the programming time. I told Xilinx that they should
>> clean up their example code, and they basically told me to go away and
>> leave them alone.
>
> Interesting :-)
>
> I'm sure there is somewhere you could post the cleaned up code...
>
> What was the final 8051 Code / RAM footprint, after you did this ?
>
> Did you look at run length compression, or just use a BIT file copy ?
>
> -jg

===================================
Greg Neff
VP Engineering
*Microsym* Computers Inc.
greg@guesswhichwordgoeshere.com

Article: 37664
Here's my two cents worth (maybe not even that much).

1) Peter, the term barrel shift is commonly (although technically incorrectly) applied to shifters which have a variable shift distance. The Virtex-II multipliers can in fact be used this way, but it can be done considerably faster (with more pipelining) in the fabric for very little additional cost, especially when you consider the resources taken by the added pipeline registers you need in front of and behind the multiplier to get anywhere close to the data sheet speeds. It all comes down to how do I best use the resources available to me.

2) The carry chain can also be used for a free doubler circuit. However, watch the timing. There exist false paths (that are also quite slow, comparatively speaking) introduced by the non-standard use of the carry chain (the chain connections are only used to the next neighbor, not all the way up the chain). Timing-wise, the conventional approach seems to yield better propagation delays in combinatorial-only shifters, and considerably better times in fully pipelined shifters. This is a good trick to put in your back pocket for those times where the need for density outweighs the needs of the clock cycle.

3) I'd be interested in seeing your layout solution. The layout is not trivial to making this perform well.

--
--Ray Andraka, P.E.
President, the Andraka Consulting Group, Inc.
401/884-7930 Fax 401/884-7950
email ray@andraka.com
http://www.andraka.com

"They that give up essential liberty to obtain a little
temporary safety deserve neither liberty nor safety."
-Benjamin Franklin, 1759

Article: 37665
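The "multiplier as a free shifter" idea that comes up in the posts above can be shown with plain integer arithmetic. What follows is a hedged Python sketch, not code from anyone in the thread: the function names are invented, the 16-bit data width is an assumption chosen so that both operands fit an 18x18 multiplier, and nothing here says anything about the timing or pipelining trade-offs Ray Andraka raises.

```python
WIDTH = 16  # assumed data width (hypothetical choice for illustration)


def one_hot(k):
    """Decode a shift distance into the power of two fed to the multiplier."""
    return 1 << k


def shift_left_via_multiply(x, k, width=WIDTH):
    """Left shift: multiply by 2**k and keep the low `width` product bits."""
    return (x * one_hot(k)) & ((1 << width) - 1)


def shift_right_via_multiply(x, k, width=WIDTH):
    """Right shift: multiply by 2**(width - k) and keep the upper product bits.

    For k = 0 the multiplier operand is 2**width, which needs width + 1 bits.
    """
    return (x * one_hot(width - k)) >> width


if __name__ == "__main__":
    x = 0xB38F
    for k in (0, 1, 5, 15):
        print(k, hex(shift_left_via_multiply(x, k)), hex(shift_right_via_multiply(x, k)))
```

The only logic needed outside the multiplier is the one-hot decode of the shift distance, which is why the approach looks "free" when spare multipliers are available.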
On Wed, 19 Dec 2001 07:54:19 +1300, Jim Granville <jim.granville@designtools.co.nz> wrote:

>Greg Neff wrote:
><snip>
>> BTW, we cleaned up the Xilinx 8051 code to get rid of signed
>> variables, unnecessarily long variables, and other inefficiencies.
>> This halved the programming time. I told Xilinx that they should
>> clean up their example code, and they basically told me to go away and
>> leave them alone.
>
> Interesting :-)
>
> I'm sure there is somewhere you could post the cleaned up code...

No can do. We developed the code for a paying customer, so we are not free to give it away.

>
> What was the final 8051 Code / RAM footprint, after you did this ?

It was a few modules of a large piece of automatic production programming and test code. I scanned the link map, and it looks like it used about 2,300 bytes of code, and 36 bytes of RAM in data space.

>
> Did you look at run length compression, or just use a BIT file copy ?
>

I just used the standard JED -> SVF -> XSVF file flow. I stored this file, as well as other production XSVF and HEX files, in a big external flash. As part of the tester FPGA I built an automatic address incrementer for this big flash, so that I could easily sequentially read each byte from the files to be programmed into the devices on the UUT.

===================================
Greg Neff
VP Engineering
*Microsym* Computers Inc.
greg@guesswhichwordgoeshere.com

Article: 37666
Typically, the schematic symbol isn't a "canned" symbol, it is a custom symbol tailored to the function/pinout of the FPGA. They don't take THAT long to make if you know the tool...

Does anyone here **really** use a "canned" symbol for their FPGAs? For a PAL, yes, but an FPGA?

"Peter Fenn" <Peter.Fenn@avnet.com> wrote in message news:ee73c6a.-1@WebX.sUN8CHnE...
> Spartan-IIE: I am urgently looking for a (board-level) schematic symbol (preferably ORCAD or VIEWLOGIC) for an XC2S100E-6FT256C Xilinx FPGA. Is anyone in a position to help on this?
> -Thanks in advance :-)

Article: 37667
Hi Peter,

"Peter Alfke" <peter.alfke@xilinx.com> wrote in message news:3C1F8AEC.BFD2E067@xilinx.com...
> This is a friendly and helpful newsgroup, but let's make sure that it does not
> get abused.
> Lots of textbooks explain how to divide by a power of 2, where the remainder is,
> and how you sign-extend the MSB. Explaining that is not the purpose of this
> newsgroup.

Where does it say that? I've never seen the charter, but I certainly wouldn't turn away a question of how to do division in an FPGA because it's in a textbook!

> Let's use our "bandwidth" for more complex and perhaps controversial questions
> that are not explained in textbooks and data books.

Why? If someone doesn't like a particular discussion topic, then cripes, just don't read it!

Regards,

Austin

Article: 37668
On Tue, 18 Dec 2001 16:57:29 -0500, "Austin Franklin" <austin@dark98room.com> wrote:
(snip)
>
>Does anyone here **really** use a "canned" symbol for their FPGAs? For a
>PAL, yes, but an FPGA?
>
(snip)

Not a chance. With large pin counts we usually build a heterogeneous symbol so that we can put different functional blocks of the FPGA on different schematic sheets. This makes the design easier to follow. Unfortunately, it is a tedious manual process that has to be checked and double-checked.

===================================
Greg Neff
VP Engineering
*Microsym* Computers Inc.
greg@guesswhichwordgoeshere.com

Article: 37669
So let's talk controversial....

If Lucent can support hard macros in Epic with hard routing, then why can't Xilinx? My application requires it and Xilinx doesn't support it in FPGA editor (which was programmed by the same softies as Epic). Oh, I remember why they don't support it. Because nobody cares about designs that push the limitations of FPGAs. Because everybody else that is making designs for Xilinx parts is still in kindergarten finger painting with verilog and hdl. Ha, I didn't get my EE degree to be a soft weirdo. Anybody can throw code together and get poor performance.

flame away kindergarten kids

Bryan

"Peter Alfke" <peter.alfke@xilinx.com> wrote in message news:3C1F8AEC.BFD2E067@xilinx.com...
> This is a friendly and helpful newsgroup, but let's make sure that it does not
> get abused.
> Lots of textbooks explain how to divide by a power of 2, where the remainder is,
> and how you sign-extend the MSB. Explaining that is not the purpose of this
> newsgroup.
>
> Let's use our "bandwidth" for more complex and perhaps controversial questions
> that are not explained in textbooks and data books.
>
> Peter Alfke, Xilinx Applications
>
>

Article: 37670
> The Altera NIOS softcore processor comes with a flexible, parameterizable
> SPI interface module in VHDL or Verilog. The complete NIOS license with all
> tools, board and of course SPI is US-$ 995,-
> Check out:

For that matter, the Xilinx XS95 and XS40 boards come with codecs on them
that are, I believe, SPI compatible (correct me if I'm wrong). Check out
http://www.xess.com for the documentation. There's source for a SPI library
that may do what you want.

Cheers,
Adam

=================================
IMPORTANT: This email and any attachments may be confidential. Any
retransmissions, dissemination or other use of these materials by persons or
entities other than the intended recipient is prohibited. If received in
error, please contact us and delete all copies. Before opening or using
attachments, check them for viruses and defects. Our liability is limited to
resupplying any affected attachments.
[Any representations or opinions expressed in this e.mail are those of the
individual sender, and not necessarily those of Vision Systems Limited]
Article: 37671
Let's see how long a VHDL chunk will fit in this forum...

library IEEE;
use IEEE.std_logic_1164.all;

-- Divide by 3 circuit example.  Uses only 85 LUTs
-- (up to 4-input) to compute the quotient and
-- remainder when a 32-bit input is divided by 3.
--
-- Designer: Carl Brannen
--
-- Feel free to modify this circuit and use it in
-- your own designs.  I am aware of no patents that
-- it infringes on, but you will have to make your
-- own determination of this.  My only request is
-- that you leave a comment to the effect that your
-- knowledge of the algorithm is through me.
--
-- Synthesize with optimize set for "low", and
-- "area".  This circuit is already optimized,
-- the computer will only waste its time (and likely
-- increase the size and delay of the result) if it
-- tries to optimize further.
--
-- This code was written in response to this post
-- on the comp.arch.FPGA thread:
--
-- <<<
-- "I need to implement in an fpga an algorithm that will divide an integer
-- by 3. The dividend length is still to be determined but will be
-- somewhere between 20 and 30 bits, and the divisor is always the number
-- 3.
--
-- Does anyone know an efficient combinatoric algorithm that can accomplish
-- this?
-- >>>
--
-- http://www.fpga-faq.com/archives/11400.html#11409

entity DIV32_3 is
    port (
        CLK:    in  STD_LOGIC;
        AIN:    in  STD_LOGIC_VECTOR(31 downto 0);
        REMOUT: out STD_LOGIC_VECTOR( 1 downto 0);
        QOUT:   out STD_LOGIC_VECTOR(30 downto 0);
        TEST:   out STD_LOGIC_VECTOR(53 downto 0)
    );
end DIV32_3;

architecture DIV32_3_arch of DIV32_3 is
    -- Partial remainders:
    signal R1V:         STD_LOGIC_VECTOR(15 downto 0);  -- 16 LUTs
    signal R2V:         STD_LOGIC_VECTOR( 7 downto 0);  --  8 LUTs
    signal R4V:         STD_LOGIC_VECTOR( 3 downto 0);  --  4 LUTs
    signal R8V:         STD_LOGIC_VECTOR( 1 downto 0);  --  2 LUTs
    signal P4V:         STD_LOGIC_VECTOR( 1 downto 0);  --  2 LUTs
    signal P3V:         STD_LOGIC_VECTOR( 1 downto 0);  --  2 LUTs
    signal P2V:         STD_LOGIC_VECTOR( 1 downto 0);  --  2 LUTs
    signal P1V:         STD_LOGIC_VECTOR( 1 downto 0);  --  2 LUTs
    signal P0V:         STD_LOGIC_VECTOR( 1 downto 0);  --  2 LUTs
    -- Rearrangement of partial remainders only:
    signal PRV:         STD_LOGIC_VECTOR(15 downto 0);  -- 16 LUTs
    -- carries internal to blocks:
    signal X0V:         STD_LOGIC_VECTOR(13 downto 0);  -- 14 LUTs
    -- Flip-flop for QOUT:
    signal QOUTQ,QOUTD: STD_LOGIC_VECTOR(30 downto 0);  -- 31 LUTs
    -- Flip-flop for Remainder:
    signal REMQ,REMD:   STD_LOGIC_VECTOR( 1 downto 0);  --
    --                                                     -------
    -- Total LUT count:                                 -- 85 LUTs (45 slices or 23 CLBs)

    -- Force tool to not "optimize" (i.e. bloat) the design
    -- by creating a set of flip-flop outputs.
    signal FKD,FKQ:     STD_LOGIC_VECTOR(53 downto 0);

begin

    -- Scheme for quickest determination of remainder when dividing by 3.
    -- Example, 32-bit input, R80 provides the 2-bit remainder result in
    -- just 4 stages of 4-input LUTs:
    --
    --                 AIN
    -- 3322 2222 2222 1111 1111 1100 0000 0000
    -- 1098 7654 3210 9876 5432 1098 7654 3210
    -- ---- ---- ---- ---- ---- ---- ---- ----
    -- R17  R16  R15  R14  R13  R12  R11  R10
    --   \  /      \  /      \  /      \  /
    --    R23       R22       R21       R20
    --      \       /           \       /
    --       R41                 R40
    --         -----\       /-----
    --                 R80
    --
    -- Scheme for computing quotients from the above
    -- remainder scheme.  More partial remainders have
    -- to be computed, as compared to the above remainders,
    -- these ones are called PRs.
    --
    -- The quotient for the highest four bits is computed
    -- directly from (no greater than 4-input) LUTs.  The
    -- lower quotients all require a remainder input.  That
    -- remainder allows direct computation of the quotient
    -- for the high two bits, and a partial remainder needs
    -- to be computed to get the lower 2 bits as well.  The
    -- following diagram suppresses Rxx that aren't used, and
    -- only shows how the PRs are calculated.  The lowest
    -- value in each column gives the Rxx or PRx that computes
    -- the partial remainder at that column:
    --
    --                 AIN
    -- 3322 2222 2222 1111 1111 1100 0000 0000
    -- 1098 7654 3210 9876 5432 1098 7654 3210
    -- ---- ---- ---- ---- ---- ---- ---- ----
    -- R17       R15       R13       R11
    --            |         |         |
    --           P4         |   R21   |
    --          /           |    | \  |
    --      R23            P3   P2   P1
    --                    /     /     |
    --                   /     /     P0
    --                  /-----/-----/
    --               R41
    --
    -- ---- ---- ---- ---- ---- ---- ---- ----
    -- R17  R23  P4   R41  P3   P2   P0   R8
    -- PR7  PR6  PR5  PR4  PR3  PR2  PR1  PR0
    --
    --
    -- From the above tables, it's clear that the
    -- longest computation is that of the quotient
    -- at position 0, as would be expected.  The
    -- number of stages of logic is only 6, and the
    -- longest paths are as follows:
    --
    -- R17  R16  R15  R14  R13  R12
    --   \  /      \  /      \  /
    --    R23       R22       R21
    --      \       /           \
    --       R41                 P1
    --          \               /
    --                  P0
    --                  |
    --            Remainder[3:2]
    --                  |
    --            Quotient[1:0]
    --
    --
    -- In order to make the VHDL shorter, I've packed
    -- the remainders into longer STD_LOGIC_VECTORs
    -- as follows:
    -- R1V <= R17 & R16 & ... R10
    -- R2V <= R23 & R22 & R21 & R20
    -- R4V <= R41 & R40
    -- R8V <= R80
    --
    --
    -- I normally don't like to complicate things any more
    -- than they have to, but I hate to have to create all
    -- those unnecessary "SEL" assignments.
    -- If Xilinx would support a select statement like this:
    --
    -- with AIN(4*I+3 downto 4*I)
    --
    -- I wouldn't have to do this this way, but this is the first
    -- way to implement this that comes to mind.  I guess I could
    -- define LUT4s, since none of these are trivial, but that wouldn't
    -- port to Altera.

    -- Generate the R1x logic (16 LUTs)
    G1: for I in 0 to 7 generate
        R1V(I*2 + 0) <=
            ((not AIN(4*I+3)) and (not AIN(4*I+2)) and (not AIN(4*I+1)) and (    AIN(4*I+0))) or
            ((not AIN(4*I+3)) and (    AIN(4*I+2)) and (not AIN(4*I+1)) and (not AIN(4*I+0))) or
            ((not AIN(4*I+3)) and (    AIN(4*I+2)) and (    AIN(4*I+1)) and (    AIN(4*I+0))) or
            ((    AIN(4*I+3)) and (not AIN(4*I+2)) and (    AIN(4*I+1)) and (not AIN(4*I+0))) or
            ((    AIN(4*I+3)) and (    AIN(4*I+2)) and (not AIN(4*I+1)) and (    AIN(4*I+0)));
        R1V(I*2 + 1) <=
            ((not AIN(4*I+3)) and (not AIN(4*I+2)) and (    AIN(4*I+1)) and (not AIN(4*I+0))) or
            ((not AIN(4*I+3)) and (    AIN(4*I+2)) and (not AIN(4*I+1)) and (    AIN(4*I+0))) or
            ((    AIN(4*I+3)) and (not AIN(4*I+2)) and (not AIN(4*I+1)) and (not AIN(4*I+0))) or
            ((    AIN(4*I+3)) and (not AIN(4*I+2)) and (    AIN(4*I+1)) and (    AIN(4*I+0))) or
            ((    AIN(4*I+3)) and (    AIN(4*I+2)) and (    AIN(4*I+1)) and (not AIN(4*I+0)));
    end generate;

    -- Generate the R2x logic (8 LUTs)
    G2: for I in 0 to 3 generate
        R2V(I*2 + 0) <=
            ((not R1V(4*I+3)) and (not R1V(4*I+2)) and (not R1V(4*I+1)) and (    R1V(4*I+0))) or
            ((not R1V(4*I+3)) and (    R1V(4*I+2)) and (not R1V(4*I+1)) and (not R1V(4*I+0))) or
            ((not R1V(4*I+3)) and (    R1V(4*I+2)) and (    R1V(4*I+1)) and (    R1V(4*I+0))) or
            ((    R1V(4*I+3)) and (not R1V(4*I+2)) and (    R1V(4*I+1)) and (not R1V(4*I+0))) or
            ((    R1V(4*I+3)) and (    R1V(4*I+2)) and (not R1V(4*I+1)) and (    R1V(4*I+0)));
        R2V(I*2 + 1) <=
            ((not R1V(4*I+3)) and (not R1V(4*I+2)) and (    R1V(4*I+1)) and (not R1V(4*I+0))) or
            ((not R1V(4*I+3)) and (    R1V(4*I+2)) and (not R1V(4*I+1)) and (    R1V(4*I+0))) or
            ((    R1V(4*I+3)) and (not R1V(4*I+2)) and (not R1V(4*I+1)) and (not R1V(4*I+0))) or
            ((    R1V(4*I+3)) and (not R1V(4*I+2)) and (    R1V(4*I+1)) and (    R1V(4*I+0))) or
            ((    R1V(4*I+3)) and (    R1V(4*I+2)) and (    R1V(4*I+1)) and (not R1V(4*I+0)));
    end generate;

    -- Generate the R4x logic (4 LUTs)
    G4: for I in 0 to 1 generate
        R4V(I*2 + 0) <=
            ((not R2V(4*I+3)) and (not R2V(4*I+2)) and (not R2V(4*I+1)) and (    R2V(4*I+0))) or
            ((not R2V(4*I+3)) and (    R2V(4*I+2)) and (not R2V(4*I+1)) and (not R2V(4*I+0))) or
            ((not R2V(4*I+3)) and (    R2V(4*I+2)) and (    R2V(4*I+1)) and (    R2V(4*I+0))) or
            ((    R2V(4*I+3)) and (not R2V(4*I+2)) and (    R2V(4*I+1)) and (not R2V(4*I+0))) or
            ((    R2V(4*I+3)) and (    R2V(4*I+2)) and (not R2V(4*I+1)) and (    R2V(4*I+0)));
        R4V(I*2 + 1) <=
            ((not R2V(4*I+3)) and (not R2V(4*I+2)) and (    R2V(4*I+1)) and (not R2V(4*I+0))) or
            ((not R2V(4*I+3)) and (    R2V(4*I+2)) and (not R2V(4*I+1)) and (    R2V(4*I+0))) or
            ((    R2V(4*I+3)) and (not R2V(4*I+2)) and (not R2V(4*I+1)) and (not R2V(4*I+0))) or
            ((    R2V(4*I+3)) and (not R2V(4*I+2)) and (    R2V(4*I+1)) and (    R2V(4*I+0))) or
            ((    R2V(4*I+3)) and (    R2V(4*I+2)) and (    R2V(4*I+1)) and (not R2V(4*I+0)));
    end generate;

    -- The R80 logic: (2 LUTs)
    R8V(0) <=
        ((not R4V(3)) and (not R4V(2)) and (not R4V(1)) and (    R4V(0))) or
        ((not R4V(3)) and (    R4V(2)) and (not R4V(1)) and (not R4V(0))) or
        ((not R4V(3)) and (    R4V(2)) and (    R4V(1)) and (    R4V(0))) or
        ((    R4V(3)) and (not R4V(2)) and (    R4V(1)) and (not R4V(0))) or
        ((    R4V(3)) and (    R4V(2)) and (not R4V(1)) and (    R4V(0)));
    R8V(1) <=
        ((not R4V(3)) and (not R4V(2)) and (    R4V(1)) and (not R4V(0))) or
        ((not R4V(3)) and (    R4V(2)) and (not R4V(1)) and (    R4V(0))) or
        ((    R4V(3)) and (not R4V(2)) and (not R4V(1)) and (not R4V(0))) or
        ((    R4V(3)) and (not R4V(2)) and (    R4V(1)) and (    R4V(0))) or
        ((    R4V(3)) and (    R4V(2)) and (    R4V(1)) and (not R4V(0)));

    -- P4 = R23 # R15 (2 LUTs)
    P4V(0) <=
        ((not R2V(7)) and (not R2V(6)) and (not R1V(11)) and (    R1V(10))) or
        ((not R2V(7)) and (    R2V(6)) and (not R1V(11)) and (not R1V(10))) or
        ((not R2V(7)) and (    R2V(6)) and (    R1V(11)) and (    R1V(10))) or
        ((    R2V(7)) and (not R2V(6)) and (    R1V(11)) and (not R1V(10))) or
        ((    R2V(7)) and (    R2V(6)) and (not R1V(11)) and (    R1V(10)));
    P4V(1) <=
        ((not R2V(7)) and (not R2V(6)) and (    R1V(11)) and (not R1V(10))) or
        ((not R2V(7)) and (    R2V(6)) and (not R1V(11)) and (    R1V(10))) or
        ((    R2V(7)) and (not R2V(6)) and (not R1V(11)) and (not R1V(10))) or
        ((    R2V(7)) and (not R2V(6)) and (    R1V(11)) and (    R1V(10))) or
        ((    R2V(7)) and (    R2V(6)) and (    R1V(11)) and (not R1V(10)));

    -- P3 = R41 # R13 (2 LUTs)
    P3V(0) <=
        ((not R4V(3)) and (not R4V(2)) and (not R1V(7)) and (    R1V(6))) or
        ((not R4V(3)) and (    R4V(2)) and (not R1V(7)) and (not R1V(6))) or
        ((not R4V(3)) and (    R4V(2)) and (    R1V(7)) and (    R1V(6))) or
        ((    R4V(3)) and (not R4V(2)) and (    R1V(7)) and (not R1V(6))) or
        ((    R4V(3)) and (    R4V(2)) and (not R1V(7)) and (    R1V(6)));
    P3V(1) <=
        ((not R4V(3)) and (not R4V(2)) and (    R1V(7)) and (not R1V(6))) or
        ((not R4V(3)) and (    R4V(2)) and (not R1V(7)) and (    R1V(6))) or
        ((    R4V(3)) and (not R4V(2)) and (not R1V(7)) and (not R1V(6))) or
        ((    R4V(3)) and (not R4V(2)) and (    R1V(7)) and (    R1V(6))) or
        ((    R4V(3)) and (    R4V(2)) and (    R1V(7)) and (not R1V(6)));

    -- P2 = R41 # R21 (2 LUTs)
    P2V(0) <=
        ((not R4V(3)) and (not R4V(2)) and (not R2V(3)) and (    R2V(2))) or
        ((not R4V(3)) and (    R4V(2)) and (not R2V(3)) and (not R2V(2))) or
        ((not R4V(3)) and (    R4V(2)) and (    R2V(3)) and (    R2V(2))) or
        ((    R4V(3)) and (not R4V(2)) and (    R2V(3)) and (not R2V(2))) or
        ((    R4V(3)) and (    R4V(2)) and (not R2V(3)) and (    R2V(2)));
    P2V(1) <=
        ((not R4V(3)) and (not R4V(2)) and (    R2V(3)) and (not R2V(2))) or
        ((not R4V(3)) and (    R4V(2)) and (not R2V(3)) and (    R2V(2))) or
        ((    R4V(3)) and (not R4V(2)) and (not R2V(3)) and (not R2V(2))) or
        ((    R4V(3)) and (not R4V(2)) and (    R2V(3)) and (    R2V(2))) or
        ((    R4V(3)) and (    R4V(2)) and (    R2V(3)) and (not R2V(2)));

    -- P1 = R11 # R21 (2 LUTs)
    P1V(0) <=
        ((not R1V(3)) and (not R1V(2)) and (not R2V(3)) and (    R2V(2))) or
        ((not R1V(3)) and (    R1V(2)) and (not R2V(3)) and (not R2V(2))) or
        ((not R1V(3)) and (    R1V(2)) and (    R2V(3)) and (    R2V(2))) or
        ((    R1V(3)) and (not R1V(2)) and (    R2V(3)) and (not R2V(2))) or
        ((    R1V(3)) and (    R1V(2)) and (not R2V(3)) and (    R2V(2)));
    P1V(1) <=
        ((not R1V(3)) and (not R1V(2)) and (    R2V(3)) and (not R2V(2))) or
        ((not R1V(3)) and (    R1V(2)) and (not R2V(3)) and (    R2V(2))) or
        ((    R1V(3)) and (not R1V(2)) and (not R2V(3)) and (not R2V(2))) or
        ((    R1V(3)) and (not R1V(2)) and (    R2V(3)) and (    R2V(2))) or
        ((    R1V(3)) and (    R1V(2)) and (    R2V(3)) and (not R2V(2)));

    -- P0 = R41 # P1 (2 LUTs)
    P0V(0) <=
        ((not R4V(3)) and (not R4V(2)) and (not P1V(1)) and (    P1V(0))) or
        ((not R4V(3)) and (    R4V(2)) and (not P1V(1)) and (not P1V(0))) or
        ((not R4V(3)) and (    R4V(2)) and (    P1V(1)) and (    P1V(0))) or
        ((    R4V(3)) and (not R4V(2)) and (    P1V(1)) and (not P1V(0))) or
        ((    R4V(3)) and (    R4V(2)) and (not P1V(1)) and (    P1V(0)));
    P0V(1) <=
        ((not R4V(3)) and (not R4V(2)) and (    P1V(1)) and (not P1V(0))) or
        ((not R4V(3)) and (    R4V(2)) and (not P1V(1)) and (    P1V(0))) or
        ((    R4V(3)) and (not R4V(2)) and (not P1V(1)) and (not P1V(0))) or
        ((    R4V(3)) and (not R4V(2)) and (    P1V(1)) and (    P1V(0))) or
        ((    R4V(3)) and (    R4V(2)) and (    P1V(1)) and (not P1V(0)));

    -- Assemble the partial remainders into the inputs for the quotient
    -- calculations:
    PRV(15 downto 0)                -- Remainder of
        <= R1V(15 downto 14)        -- R17  AIN[31:28]
         & R2V( 7 downto 6)         -- R23  AIN[31:24]
         & P4V( 1 downto 0)         -- P4   AIN[31:20]
         & R4V( 3 downto 2)         -- R41  AIN[31:16]
         & P3V( 1 downto 0)         -- P3   AIN[31:12]
         & P2V( 1 downto 0)         -- P2   AIN[31:8]
         & P0V( 1 downto 0)         -- P0   AIN[31:4]
         & R8V( 1 downto 0);        -- R8   AIN[31:0]

    -- The highest quotient block has no remainder coming in,
    -- so compute it directly: (3 LUTs)
    with AIN(31 downto 28) select
        QOUTD(30 downto 28) <=
            "000" when "0000" | "0001" | "0010",
            "001" when "0011" | "0100" | "0101",
            "010" when "0110" | "0111" | "1000",
            "011" when "1001" | "1010" | "1011",
            "100" when "1100" | "1101" | "1110",
            "101" when others;

    -- 0000 00
    -- 0001 00
    -- 0010 00
    -- 0011 01
    -- 0100 01
    -- 0101 01
    -- 0110 10
    -- 0111 10
    -- 1000 10
    -- 1001 11
    -- 1010 11
    -- 1011 11

    -- Compute the quotient in blocks of 4 bits
    Q: for I in 0 to 6 generate
        -- Top two bits are computed directly from the carry-in and AIN
        QOUTD(4*I + 2) <=
            ((not PRV(2*I+3)) and (not PRV(2*I+2)) and (    AIN(4*I+3)) and (    AIN(4*I+2))) or
            ((not PRV(2*I+3)) and (    PRV(2*I+2)) and (not AIN(4*I+3)) and (not AIN(4*I+2))) or
            ((not PRV(2*I+3)) and (    PRV(2*I+2)) and (not AIN(4*I+3)) and (    AIN(4*I+2))) or
            ((    PRV(2*I+3)) and (not PRV(2*I+2)) and (not AIN(4*I+3)) and (    AIN(4*I+2))) or
            ((    PRV(2*I+3)) and (not PRV(2*I+2)) and (    AIN(4*I+3)) and (not AIN(4*I+2))) or
            ((    PRV(2*I+3)) and (not PRV(2*I+2)) and (    AIN(4*I+3)) and (    AIN(4*I+2)));
        QOUTD(4*I + 3) <=
            ((not PRV(2*I+3)) and (    PRV(2*I+2)) and (    AIN(4*I+3)) and (not AIN(4*I+2))) or
            ((not PRV(2*I+3)) and (    PRV(2*I+2)) and (    AIN(4*I+3)) and (    AIN(4*I+2))) or
            ((    PRV(2*I+3)) and (not PRV(2*I+2)) and (not AIN(4*I+3)) and (not AIN(4*I+2))) or
            ((    PRV(2*I+3)) and (not PRV(2*I+2)) and (not AIN(4*I+3)) and (    AIN(4*I+2))) or
            ((    PRV(2*I+3)) and (not PRV(2*I+2)) and (    AIN(4*I+3)) and (not AIN(4*I+2))) or
            ((    PRV(2*I+3)) and (not PRV(2*I+2)) and (    AIN(4*I+3)) and (    AIN(4*I+2)));
        -- I need to compute the remainder out of the top two bits for the lower two bits
        X0V(2*I + 0) <=
            ((not PRV(2*I+3)) and (not PRV(2*I+2)) and (not AIN(4*I+3)) and (    AIN(4*I+2))) or
            ((not PRV(2*I+3)) and (    PRV(2*I+2)) and (not AIN(4*I+3)) and (not AIN(4*I+2))) or
            ((not PRV(2*I+3)) and (    PRV(2*I+2)) and (    AIN(4*I+3)) and (    AIN(4*I+2))) or
            ((    PRV(2*I+3)) and (not PRV(2*I+2)) and (    AIN(4*I+3)) and (not AIN(4*I+2))) or
            ((    PRV(2*I+3)) and (    PRV(2*I+2)) and (not AIN(4*I+3)) and (    AIN(4*I+2)));
        X0V(2*I + 1) <=
            ((not PRV(2*I+3)) and (not PRV(2*I+2)) and (    AIN(4*I+3)) and (not AIN(4*I+2))) or
            ((not PRV(2*I+3)) and (    PRV(2*I+2)) and (not AIN(4*I+3)) and (    AIN(4*I+2))) or
            ((    PRV(2*I+3)) and (not PRV(2*I+2)) and (not AIN(4*I+3)) and (not AIN(4*I+2))) or
            ((    PRV(2*I+3)) and (not PRV(2*I+2)) and (    AIN(4*I+3)) and (    AIN(4*I+2))) or
            ((    PRV(2*I+3)) and (    PRV(2*I+2)) and (    AIN(4*I+3)) and (not AIN(4*I+2)));
        -- Now I can compute the lowest two bits:
        QOUTD(4*I + 0) <=
            ((not X0V(2*I+1)) and (not X0V(2*I+0)) and (    AIN(4*I+1)) and (    AIN(4*I+0))) or
            ((not X0V(2*I+1)) and (    X0V(2*I+0)) and (not AIN(4*I+1)) and (not AIN(4*I+0))) or
            ((not X0V(2*I+1)) and (    X0V(2*I+0)) and (not AIN(4*I+1)) and (    AIN(4*I+0))) or
            ((    X0V(2*I+1)) and (not X0V(2*I+0)) and (not AIN(4*I+1)) and (    AIN(4*I+0))) or
            ((    X0V(2*I+1)) and (not X0V(2*I+0)) and (    AIN(4*I+1)) and (not AIN(4*I+0))) or
            ((    X0V(2*I+1)) and (not X0V(2*I+0)) and (    AIN(4*I+1)) and (    AIN(4*I+0)));
        QOUTD(4*I + 1) <=
            ((not X0V(2*I+1)) and (    X0V(2*I+0)) and (    AIN(4*I+1)) and (not AIN(4*I+0))) or
            ((not X0V(2*I+1)) and (    X0V(2*I+0)) and (    AIN(4*I+1)) and (    AIN(4*I+0))) or
            ((    X0V(2*I+1)) and (not X0V(2*I+0)) and (not AIN(4*I+1)) and (not AIN(4*I+0))) or
            ((    X0V(2*I+1)) and (not X0V(2*I+0)) and (not AIN(4*I+1)) and (    AIN(4*I+0))) or
            ((    X0V(2*I+1)) and (not X0V(2*I+0)) and (    AIN(4*I+1)) and (not AIN(4*I+0))) or
            ((    X0V(2*I+1)) and (not X0V(2*I+0)) and (    AIN(4*I+1)) and (    AIN(4*I+0)));
    end generate;

    -- The following flip-flop implication of all the partial logic is included only
    -- to prevent the synthesis tool from optimizing out the beautiful logic I've
    -- created.
    FKD(53 downto 0) <= R1V & R2V & R4V & R8V & P4V & P3V & P2V & P1V & P0V & X0V;

    -- Register the remainder
    REMD(1 downto 0) <= PRV(1 downto 0);

    process (CLK)
    begin
        if CLK'event and CLK='1' then
            FKQ   <= FKD;
            QOUTQ <= QOUTD;
            REMQ  <= REMD;
        end if;
    end process;

    -- Output assignment:
    REMOUT <= REMQ(1 downto 0);
    QOUT   <= QOUTQ(30 downto 0);

    -- Test output (unused, but included for synthesis restriction).
    TEST <= FKQ;

end DIV32_3_arch;

--
Posted from firewall.terabeam.com [216.137.15.2]
via Mailgate.ORG Server - http://www.Mailgate.ORG
Article: 37672
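[Editorial aside on the DIV32_3 listing in article 37671, not part of any posting.] For anyone who wants to sanity-check the divide-by-3 scheme without a simulator, here is a rough software model in Python. It is only a sketch under stated assumptions: the function names `rem3_tree` and `div3_blocks` are made up for illustration, remainders are plain integers 0..2 rather than the 2-bit LUT encoding, and each 4-bit block is done in one step instead of the two 2-bit steps (via X0V) used in the VHDL. The point it tries to show is why the tree works: 16 mod 3 = 1, so the remainder of two concatenated fields is the sum of their remainders mod 3, and each quotient nibble depends only on its own nibble plus the remainder of everything above it.

```python
def rem3_tree(a):
    """Remainder of a 32-bit value mod 3, combined pairwise like R1x -> R2x -> R4x -> R80."""
    # Stage 1: one remainder per 4-bit nibble (index 0 is the lowest nibble).
    level = [((a >> (4 * i)) & 0xF) % 3 for i in range(8)]
    # Later stages: every power of 16 is congruent to 1 mod 3, so the remainder
    # of a concatenated pair is just the sum of the two remainders, mod 3.
    while len(level) > 1:
        level = [(level[i] + level[i + 1]) % 3 for i in range(0, len(level), 2)]
    return level[0]


def div3_blocks(a):
    """Quotient and remainder of a 32-bit value divided by 3, one nibble per block."""
    quotient = 0
    for i in range(8):
        nibble = (a >> (4 * i)) & 0xF
        # Carry-in: remainder of all bits above this nibble. The VHDL takes
        # this from the PRV vector; it is recomputed directly here (assumption).
        carry = (a >> (4 * (i + 1))) % 3
        # carry*16 + nibble is at most 47, so the block quotient fits in 4 bits.
        quotient |= ((carry * 16 + nibble) // 3) << (4 * i)
    return quotient, rem3_tree(a)


if __name__ == "__main__":
    # Spot check against Python's own divmod.
    for value in (0, 5, 0x12345678, 0xFFFFFFFF):
        assert div3_blocks(value) == divmod(value, 3)
```

Because no block waits on the quotient of the block above it, only on a prefix remainder, all eight blocks can be evaluated side by side, which appears to be what keeps the logic depth in the listing so shallow.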
> Anybody can throw code
> together and get poor performance.

Wait...this isn't a Microsoft news group, is it? ;-)
Article: 37673
Bryan,

Reminds me of the Dilbert cartoon where they are telling tales of their
early programming years...

"I remember using assembly code..."
"That is nothing, I remember using 1's and 0's...."
"You had zeroes? Wow, we had to use lower case 'l's and upper case 'oh's..."
"Bunch of babies, I only had 1's!"

Why we as engineers would enjoy pain, and brag about it, still amazes me.

A design that is well architected, self documented, commented, and reliable
is more important to many customers. I prefer to throw all of my energies
into supporting those designs (in HDLs), which now account for 99% of what
is being done out there.

Austin

Bryan wrote:

> So let's talk controversial....
>
> If Lucent can support hard macros in Epic with hard routing, then why can't
> Xilinx? My application requires it and Xilinx doesn't support it in FPGA
> Editor (which was programmed by the same softies as Epic). Oh, I remember
> why they don't support it. Because nobody cares about designs that push the
> limitations of FPGAs. Because everybody else that is making designs for
> Xilinx parts is still in kindergarten finger painting with Verilog and HDL.
> Ha, I didn't get my EE degree to be a soft weirdo. Anybody can throw code
> together and get poor performance.
>
> flame away kindergarten kids
>
> Bryan
>
> "Peter Alfke" <peter.alfke@xilinx.com> wrote in message
> news:3C1F8AEC.BFD2E067@xilinx.com...
> > This is a friendly and helpful newsgroup, but let's make sure that it does not
> > get abused.
> > Lots of textbooks explain how to divide by a power of 2, where the remainder is,
> > and how you sign-extend the MSB. Explaining that is not the purpose of this
> > newsgroup.
> >
> > Let's use our "bandwidth" for more complex and perhaps controversial questions
> > that are not explained in textbooks and data books.
> >
> > Peter Alfke, Xilinx Applications
> >
> >
Article: 37674
Hello Bryan,

FPGA Editor and the other implementation tools do support routed hard
macros. Hard macros aren't widely used because of the limitations in the
timing analysis tools in dealing with the macros, but the support is there.

Regards,
Bret Wade
Xilinx Product Applications

Bryan wrote:

> So let's talk controversial....
>
> If Lucent can support hard macros in Epic with hard routing, then why can't
> Xilinx? My application requires it and Xilinx doesn't support it in FPGA
> Editor (which was programmed by the same softies as Epic). Oh, I remember
> why they don't support it. Because nobody cares about designs that push the
> limitations of FPGAs. Because everybody else that is making designs for
> Xilinx parts is still in kindergarten finger painting with Verilog and HDL.
> Ha, I didn't get my EE degree to be a soft weirdo. Anybody can throw code
> together and get poor performance.
>
> flame away kindergarten kids
>
> Bryan
>
> "Peter Alfke" <peter.alfke@xilinx.com> wrote in message
> news:3C1F8AEC.BFD2E067@xilinx.com...
> > This is a friendly and helpful newsgroup, but let's make sure that it does not
> > get abused.
> > Lots of textbooks explain how to divide by a power of 2, where the remainder is,
> > and how you sign-extend the MSB. Explaining that is not the purpose of this
> > newsgroup.
> >
> > Let's use our "bandwidth" for more complex and perhaps controversial questions
> > that are not explained in textbooks and data books.
> >
> > Peter Alfke, Xilinx Applications
> >
> >