"David Miller" <spam@quartz.net.nz> wrote in message news:3C20222D.4040601@quartz.net.nz... > M. Ramirez's question still holds good -- is there ever a reason not to > pack flops into IOBs? David, Even if there was a reason, and there always is, wouldn't it be good for the default to be "Inputs and Outputs." Us crusty guys know better, but how many newbies fire up the tools not knowing EVERY detail and overlooking this one? Since IOB flip flops are a freebie and improve offset timing, why not use them as the default? Just an suggestion. SimonArticle: 37726
Hello Bryan, It sounds as though you're running into some limitations with manual routing in FPGA Editor rather than a lack of support for routed hard macros. As David mentioned, you may have more success by reusing automatic routing, either by using routed hard macros or the new directed routing feature. The directed routing feature has the advantage of not introducing any limitation wrt timing analysis. If you feel that manual routing in FED is broken, I suggest that you contact the hotline and ask them to log a CR for you. Regards, Bret Wade Xilinx Product Applications Bryan wrote: > Ok, this is great. Two other times I posted to this group about making hard > macros and got no response. The low road paid off it seems. I really want > to get these hard macros with locked routing to work. What I see when I try > to make a hard macro is that I cannot manually route the nets in editor. I > want to hand route the clock signal on local routing and hand route an > enable. The first thing that happens is I start routing the clock through a > switchbox from a normal IBUF, works fine. Then I try to re-select the input > sight to the switchbox to make a split in the routing. I cannot select the > input bubble again and if I just select the net and a new destination out of > the switchbox it complains. I have a routed design open in editor right > next to the one I am hand routing. Since I couldn't get this to work, I > decided to first see if I could route the line by hand like the router did. > I know I am selecting usable sights because I am cheating of a routed > design. So how do I re-select that bubble to continue hand routing? > > From what I have seen on the group most people are letting the tool route > the "macro" then going into editor and turning the ncd to a macro. Then > lock routing. What I have been doing is placement only and then open the > ncd to hand route my couple nets. 
What I am building into a macro is a 16 > bit FIFO, I have 16 of these FIFOs in the design and each one contains IOB > latches. Since they contain IOB latches none of them have the identical IOB > placement because of prohibit sites. If I hand route I can keep al of the > clock tree in the logic portion the same and only change which IOBs it goes > to for each macro. > > Any help greatly appreciated, but if you can answer this why can't my Xilinx > FAE? Because I used to support this stuff for NeoCAD. > > > Bryan > > "Bret Wade" <bret.wade@xilinx.com> wrote in message > news:3C1FDBC2.A88FF9B8@xilinx.com... > > Hello Bryan, > > > > FPGA Editor and the other implementation tools do support routed hard > macros. > > Hard macros aren't widely used because of the limitations in the timing > analysis > > tools in dealing with the macros, but the support is there. > > > > Regards, > > Bret Wade > > Xilinx Product Applications > > > > Bryan wrote: > > > > > So lets talk controversial.... > > > > > > If Lucent can support hard macros in Epic with hard routing, then why > can't > > > Xilinx. My application requires it and Xilinx doesn't support it in > FPGA > > > editor(which was programmed by the same softies as Epic). Oh, I > remember > > > why they don't support it. Because nobody cares about designs that push > the > > > limitations of FPGAs. Because everybody else that is making designs for > > > Xilinx parts is still in kindergarten finger painting with verilog and > hdl. > > > Ha, I didn't get my EE degree to be a soft weirdo. Anybody can throw > code > > > together and get poor performance. > > > > > > flame away kindergarten kids > > > > > > Bryan > > > > > > "Peter Alfke" <peter.alfke@xilinx.com> wrote in message > > > news:3C1F8AEC.BFD2E067@xilinx.com... > > > > This is a friendly and helpful newsgroup, but let's make sure that it > does > > > not > > > > get abused. 
> > > > Lots of textbooks explain how to divide by a power of 2, where the > > > remainder is, > > > > and how you sign-extend the MSB. Explaining that is not the purpose of > > > this > > > > newsgroup. > > > > > > > > Let's use our "bandwidth" for more complex and perhaps controversial > > > questions > > > > that are not explained in textbooks and data books. > > > > > > > > Peter Alfke, Xilinx Applications > > > > > > > > > >Article: 37727
David Miller wrote:

> > ALWAYS
> >
> >>want my designs to use IOB flip flops if possible. It seems to me that
>
> > That's what you get for using Design Mangler...er...Manager ;-)
>
> heh. I find that make does a fair job of managing builds. But then, I
> always did find CLIs more user friendly than GUIs.
>
> Even if you invoke map from the commandline or means other than through
> DM, packing flops into I/Os is not done unless the -pr flag is supplied.
> So I suppose DM is following the defaults of map.
>
> M. Ramirez's question still holds good -- is there ever a reason not to
> pack flops into IOBs?
>

I think that packing registers is not the default map option because the expectation is that registers will have IOB=TRUE|FALSE attributes applied to them by the front end tool. This attribute takes precedence over the -pr map switch and allows for individual control of registers.

Regards,
Bret Wade
Xilinx Product Applications

Article: 37728
Hi, Have you considered changing your logic design? I can't say for sure from reading your timing report, but it looks like you are using FRAME# and IRDY#, through some combinational logic, to enable the output flip flops for the AD bus. I can imagine several places where you would do this; one is the logic that clocks out the address for your initiator when the bus is idle and you have GNT#. This would clock out the address for the address phase(s). You can eliminate this path entirely by continually clocking out an address, even if it is invalid, because that address will not be driven on the bus (i.e. it doesn't matter what is in the output flops if the tristate lines are high...) I prefer to think of this problem (the one of the output clock enables for the data path) as "When do I want to HALT outgoing data?" instead of trying to solve the more natural problem of when you should actually be enabling it. Use the fact that the datapath will be tristated to your advantage. You will still have to solve the problem of when to turn off the tristates, though. Eric Kevin Brace wrote: > > Hi, I will like to know if someone knows the strategies on how to reduce > routing (net) delays for Spartan-II. > So far, I treated synthesis tool(XST)/Map/Par as a blackbox, but because > my design (a PCI IP core) was not meeting Tsu (Tsu < 7ns), I started to > take a closer look of how LUTs are placed on the FPGA. > Using Floorplanner, I saw the LUTs being placed all over the FPGA, so I > decided to hand place the LUTs using UCF flow. > That was the most effective thing I did to reduce interconnect delay > (reduced the worst interconnect delay by about 2.7 ns (11 ns down to 8.3 > ns)), but unfortunately, I still have to reduce the interconnect delay > by 1.3 ns (worst Tsu currently at 8.3 ns). > Basically, I have two input signals, FRAME# and IRDY# that are not > meeting timings. > Here are the two of the worst violators for FRAME# and IRDY#, > respectively. 
> > ________________________________________________________________________________ > > ================================================================================ > Timing constraint: COMP "frame_n" OFFSET = IN 7 nS BEFORE COMP "clk" ; > > 503 items analyzed, 61 timing errors detected. > Minimum allowable offset is 8.115ns. > > -------------------------------------------------------------------------------- > Slack: -1.115ns (requirement - (data path - clock path > - clock arrival)) > Source: frame_n > Destination: PCI_IP_Core_Instance_ad_Port_2 > Destination Clock: clk_BUFGP rising at 0.000ns > Requirement: 7.000ns > Data Path Delay: 10.556ns (Levels of Logic = 6) > Clock Path Delay: 2.441ns (Levels of Logic = 2) > Timing Improvement Wizard > Data Path: frame_n to PCI_IP_Core_Instance_ad_Port_2 > Delay type Delay(ns) Logical Resource(s) > ---------------------------- ------------------- > Tiopi 1.224 frame_n > frame_n_IBUF > net (fanout=45) 0.591 frame_n_IBUF > Tilo 0.653 PCI_IP_Core_Instance_I_25_LUT_7 > net (fanout=3) 0.683 N21918 > Tbxx 0.981 PCI_IP_Core_Instance_I_XXL_1357_1 > net (fanout=15) 2.352 PCI_IP_Core_Instance_I_XXL_1357_1 > Tilo 0.653 PCI_IP_Core_Instance_I_125_LUT_17 > net (fanout=1) 0.749 PCI_IP_Core_Instance_N3059 > Tilo 0.653 PCI_IP_Core_Instance_I__n0055 > net (fanout=1) 0.809 PCI_IP_Core_Instance_N3069 > Tioock 1.208 PCI_IP_Core_Instance_ad_Port_2 > ---------------------------- ------------------------------ > Total 10.556ns (5.372ns logic, 5.184ns route) > (50.9% logic, 49.1% route) > > Clock Path: clk to PCI_IP_Core_Instance_ad_Port_2 > Delay type Delay(ns) Logical Resource(s) > ---------------------------- ------------------- > Tgpio 1.082 clk > clk_BUFGP/IBUFG > net (fanout=1) 0.007 clk_BUFGP/IBUFG > Tgio 0.773 clk_BUFGP/BUFG > net (fanout=423) 0.579 clk_BUFGP > ---------------------------- ------------------------------ > Total 2.441ns (1.855ns logic, 0.586ns route) > (76.0% logic, 24.0% route) > > 
-------------------------------------------------------------------------------- > > ================================================================================ > Timing constraint: COMP "irdy_n" OFFSET = IN 7 nS BEFORE COMP "clk" ; > > 698 items analyzed, 74 timing errors detected. > Minimum allowable offset is 8.290ns. > > -------------------------------------------------------------------------------- > Slack: -1.290ns (requirement - (data path - clock path > - clock arrival)) > Source: irdy_n > Destination: PCI_IP_Core_Instance_ad_Port_2 > Destination Clock: clk_BUFGP rising at 0.000ns > Requirement: 7.000ns > Data Path Delay: 10.731ns (Levels of Logic = 6) > Clock Path Delay: 2.441ns (Levels of Logic = 2) > Timing Improvement Wizard > Data Path: irdy_n to PCI_IP_Core_Instance_ad_Port_2 > Delay type Delay(ns) Logical Resource(s) > ---------------------------- ------------------- > Tiopi 1.224 irdy_n > irdy_n_IBUF > net (fanout=138) 0.766 irdy_n_IBUF > Tilo 0.653 PCI_IP_Core_Instance_I_25_LUT_7 > net (fanout=3) 0.683 N21918 > Tbxx 0.981 PCI_IP_Core_Instance_I_XXL_1357_1 > net (fanout=15) 2.352 PCI_IP_Core_Instance_I_XXL_1357_1 > Tilo 0.653 PCI_IP_Core_Instance_I_125_LUT_17 > net (fanout=1) 0.749 PCI_IP_Core_Instance_N3059 > Tilo 0.653 PCI_IP_Core_Instance_I__n0055 > net (fanout=1) 0.809 PCI_IP_Core_Instance_N3069 > Tioock 1.208 PCI_IP_Core_Instance_ad_Port_2 > ---------------------------- ------------------------------ > Total 10.731ns (5.372ns logic, 5.359ns route) > (50.1% logic, 49.9% route) > > Clock Path: clk to PCI_IP_Core_Instance_ad_Port_2 > Delay type Delay(ns) Logical Resource(s) > ---------------------------- ------------------- > Tgpio 1.082 clk > clk_BUFGP/IBUFG > net (fanout=1) 0.007 clk_BUFGP/IBUFG > Tgio 0.773 clk_BUFGP/BUFG > net (fanout=423) 0.579 clk_BUFGP > ---------------------------- ------------------------------ > Total 2.441ns (1.855ns logic, 0.586ns route) > (76.0% logic, 24.0% route) > > 
-------------------------------------------------------------------------------- > > Timing summary: > --------------- > > Timing errors: 135 Score: 55289 > > Constraints cover 27511 paths, 0 nets, and 4835 connections (92.1% > coverage) > > ________________________________________________________________________________ > > Locations of various resources: > > FRAME#: pin 23 > IRDY#: pin 24 > AD[2]: pin 62 > PCI_IP_Core_Instance_I_25_LUT_7: CLB_R12C1.s1 > PCI_IP_Core_Instance_I_XXL_1357_1: CLB_R12C2 > PCI_IP_Core_Instance_I_125_LUT_17: CLB_R23C9.s0 > PCI_IP_Core_Instance_I__n0055: CLB_R24C9.s0 > > Input signals other than FRAME# and IRDY# are all meeting Tsu < 7 ns > requirement, and because I now figured out how to use IOB FFs, I can > easily meet Tval < 11 ns (Tco) for all output signals. > I am using Xilinx ISE WebPack 4.1 (which doesn't come with FPGA Editor), > and the PCI IP core is written in Verilog. > The device I am targeting is Xilinx Spartan-II 150K system gate speed > grade -5 part (XC2S150-5CPQ208), and I did meet all 33MHz PCI timings > with Spartan-II 150K system gate speed grade -6 part (XC2S150-6CPQ208) > when I resynthesized the PCI IP core for speed grade -6 part, and > basically reused the same UCF file with the floorplan (I had to make > small modifications to the UCF file because some of the LUT names > changed). > The reason I really care about Xilinx Spartan-II 150K system gate speed > grade -5 part is because that is the chip that is on the PCI prototype > board of Insight Electronics Spartan-II Development Kit. > Yes, I wish the PCI prototype board came with speed grade -6 . . . > Because I want the PCI IP core to be portable across different platforms > (most notably Xilinx and Altera FPGAs), I am not really interested in > making any vendor specific modification to my Verilog RTL code, but I > won't mind using various tricks in the .UCF file (for Xilinx) or .ACF > file (I believe that is the Altera equivalent of Xilinx .UCF file). 
> Here are some solutions I came up with.
>
> 1) Reduce the signal fanout (Currently at 35 globally, but FRAME# and
> IRDY#'s fanout are 200. What number should I reduce the global fanout
> to?).
>
> 2) Use USELOWSKEWLINES in a UCF file (already tried on some long
> routings, but didn't seem to help. I will try to play around with this
> option a little more with different signals.).
>
> 3) Floorplan all the LUTs and FFs on the FPGA (currently, I only
> floorplanned the LUTs that violated Tsu, and most of them take inputs
> from FRAME# and IRDY#.).
>
> 4) Use Guide file Leverage mode in Map and Par.
>
> 5) Try routing my design 2000 times (That will take several days . . . I
> once routed my design about 20 times. After routing my design 20 times,
> Par seems to get stuck in a certain Timing Score range beyond 20
> iterations.).
>
> 6) Pay for ISE Foundation 4.1 (I don't want to pay for tools because I
> am poor), and use FPGA Editor (I wish ISE WebPack came with FPGA
> Editor.). At least from FPGA Editor, I can see how the signals are
> actually getting routed.
>
> 7) Use a different synthesis tool other than XST (I am poor, so I doubt
> that I can afford one.).
>
> I would like to hear from anyone who can comment on the solutions I just
> wrote, or has other suggestions on what I can do to reduce the delays to
> meet 33MHz PCI's Tsu < 7 ns requirement.
>
> Thanks,
>
> Kevin Brace (don't respond to me directly, respond within the newsgroup)
>
> P.S. Considering that I am struggling to meet 33MHz PCI timings with
> Spartan-II speed grade -5, how does Xilinx meet 66MHz PCI timings on
> Virtex/Spartan-II speed grade -6? (I can only barely meet 33MHz PCI
> timings with Spartan-II speed grade -6 using floorplanner.)
> Is it possible to move a signal through a input pin like FRAME# and > IRDY# (pin 23 and pin 24 respectively for Spartan-II PQ208), go through > a few levels of LUTs, and reach far away IOB output FF and tri-state > control FF like pin 67 (AD[0]) or pin 203 (AD[31]) in 5 ns? (3 ns + 1.9 > to 2 ns natural clock skew = 4.9 ns to 5.0 ns realistic Tsu) > Can a signal move that fast on Virtex/Spartan-II speed grade -6? (I sort > of doubt from my experience.) > I know that Xilinx uses the special IRDY and TRDY pin in LogiCORE PCI, > but that won't seem to help FRAME#, since FRAME# has to be sampled > unregistered to determine an end of burst transfer. > What kind of tricks is Xilinx using in their LogiCORE PCI other than the > special IRDY and TRDY pin? > Does anyone know?Article: 37729
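For reference, the slack figures in the timing report quoted above follow directly from the formula the report itself prints, slack = requirement - (data path - clock path - clock arrival). A minimal Python sketch of that arithmetic is below. The function and variable names are made up for illustration; only the numbers are taken from the quoted report.

```python
def offset_in_slack(requirement_ns, data_path_ns, clock_path_ns, clock_arrival_ns=0.0):
    """Slack as printed in the report:
    requirement - (data path - clock path - clock arrival)."""
    return requirement_ns - (data_path_ns - clock_path_ns - clock_arrival_ns)


# Delays (ns) copied from the two worst paths in the quoted report.
paths = {
    "frame_n": {"data": 10.556, "clock": 2.441},
    "irdy_n": {"data": 10.731, "clock": 2.441},
}

REQUIREMENT_NS = 7.000  # OFFSET = IN 7 nS BEFORE COMP "clk"

slacks = {}
for name, p in paths.items():
    slacks[name] = round(offset_in_slack(REQUIREMENT_NS, p["data"], p["clock"]), 3)
    # The "minimum allowable offset" is the requirement that would give zero slack.
    min_offset = round(p["data"] - p["clock"], 3)
    print(f"{name}: slack {slacks[name]} ns, minimum allowable offset {min_offset} ns")
```

With the clock path fixed at 2.441 ns, the data path from either pin to the AD output flip flop has to shrink to about 9.44 ns before the 7 ns requirement is met.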
I haven't seen this algorithm published. If it is original, I'm throwing it into the public domain.

-- Efficient fall through multiplier in Xilinx Spartan2, Virtex,
-- VirtexE. (Will work in Virtex2, but they have dedicated
-- multipliers so they probably don't need this.)
--
-- This fall through multiplier gets 3 bits per initial adder
-- rather than the usual 2 bits. This is accomplished by taking
-- advantage of under utilized adders in a standard multiplier.
--
-- This is an efficient multiplier when the multiplier has
-- 3n - 1 bits, with n>1. Where it really rocks is when
-- the multiplier has (3n-1)*2^m bits.
--
-- This is in contrast to the usual Xilinx Virtex multiplier
-- which is relatively efficient when the multiplier has
-- 2*2^m bits.
--
-- For large enough multiplies, this algorithm gets more and
-- more efficient, compared to the usual Xilinx multiplier.
--
-- While a standard multiplier uses the MULT_AND logic well, the
-- stages that add up the partial product results are simple adders
-- built from LUT2s. Any time you see a big array of LUTs used with
-- less than all four inputs needed you have to wonder if there's
-- a more efficient way of packing the logic in.
--
-- A note on the usual Xilinx multiplier. (Those skilled in the art
-- should skip this section.)
--
-- The usual multiplier uses two bits per partial product. The least
-- significant partial product produces one of {0M, 1M, 2M, 3M}, while the
-- next least significant produces one of {0M, 4M, 8M, 12M}. Adding
-- together these two results produces any multiple of M from 0M to 15M:
--
--  0M + 0M = 0M
--  0M + 1M = 1M
--  0M + 2M = 2M
--  0M + 3M = 3M
--  4M + 0M = 4M
--  4M + 1M = 5M
--  4M + 2M = 6M
--  4M + 3M = 7M
--  ...
-- 12M + 2M = 14M
-- 12M + 3M = 15M
--
--
-- Instead of keeping all my numbers as positive (or 2's complement
-- negative) numbers, I save bits if I allow negative numbers and
-- keep a single bit that indicates that the result is to be interpreted
-- as a negative number. This is how Seymour Cray did his arithmetic,
-- but I don't know if he used this internally to his multipliers.
-- But if I assume that my single column of slices is only going to
-- give me one of four possible multiples of M, I have a problem. If
-- I choose the four values to be {0M, 1M, 2M, 3M}, then I only get
-- seven values when I negate them as -0M = 0M. But for a 3-bit
-- coded multiplier I'm going to need 2^3 = 8 values from each slice.
-- If I use {1M, 2M, 3M, 4M}, then I miss zero.
--
-- The solution is to use two sign bits, one for positive numbers, the
-- other for negative. Let M = 5, so the nine possible values are as
-- follows: (Note that I only need 8 of these nine values.)
--
-- P N Mult | Value
-- - - ---- + -----
-- 0 1   4  |  -20  (i.e. -4*M)
-- 0 1   3  |  -15
-- 0 1   2  |  -10
-- 0 1   1  |   -5
-- 0 0   X  |    0
-- 1 0   1  |    5
-- 1 0   2  |   10
-- 1 0   3  |   15
-- 1 0   4  |   20  (i.e. 4*M)
--
-- This would be exactly what the doctor ordered if the multiplier
-- were in a base slightly different from the octal. With octal, the
-- eight values that a digit can take are the familiar {0,1, ... 7}.
-- With this unusual base, the 8 values that a digit can take are
-- instead {-4,-3,-2,-1,0,1,2,3}. (I choose to keep these rather than
-- -3 through 4 because it saves a LUT somewhere.)
--
-- So I need a base conversion between base 8 and base, well, I'll
-- call it base 8-4. Here's how numbers are interpreted in the
-- two bases:
--
-- Base 8:   A = (A0  ) + (A1  )*8 + (A2  )*8^2 + ...
-- Base 8-4: B = (B0-4) + (B1-4)*8 + (B2-4)*8^2 + ...
--
-- From this it is obvious that to convert the number "A" in base 8
-- to base 8-4, I need merely add the octal constant o444444... to "A".
-- This perfectly converts it to the corresponding (i.e. carrying the
-- same numerical value) number in base 8-4.
--
-- This conversion is very convenient when the multiplier has a lot of
-- bits, but it isn't needed for relatively short multipliers. In
-- particular, a multiplier of n x 5 would do well to avoid performing
-- the base conversion explicitly.
--
--
-- After performing the base conversion, I take each digit from B
-- (where B = A + o44...44 = A + "100100100...100100") and use
-- it to create a single partial product. For an n x (3m-1) multiply,
-- I'll end up with m partial products.
--
-- I can't just add up the partial products because they're not in
-- the usual format for 2's complement arithmetic. I'll have to add
-- extra logic to the adder stages in order to handle the signed
-- values being added. It turns out that there is exactly enough
-- freedom in a Xilinx slice in order to handle these explicitly
-- signed numbers.
--
-- With two mode bits, I can choose 4 functions in a Xilinx arithmetic
-- slice. The table to add two signed numbers (each with "N" and "P"
-- bits, and each with a partial sum in the set 1M to 4M) is fairly
-- complicated. Let "S" be the higher precision partial product, and
-- "T" be the lower precision number.
--
-- Rather than be confusing, I'll denote the "N" bit of "S" by "S.N",
-- same with the "P" bit, and I'll denote the unsigned vector part
-- of "S" by "S.V". Same with "T". Then the function to add "S" and
-- "T" is as follows: (Note that since "S" is higher in significance
-- than "T", it follows that "S.V" is larger, as an unsigned number,
-- than "T.V", except if "S" is zero, in which case "S.V" is a don't
-- care. For this reason, all the subtractions in the following
-- table result in unsigned integers.)
--
--    "S"          "T"           "S+T"
-- ---------    ---------    ---------------
--  S  P N V     T  P N V    S+T     P N V
-- --  - - -    --  - - -    ------  - - ---
-- -A  0 1 A    -B  0 1 B    -(A+B)  0 1 A+B
-- -A  0 1 A     0  0 0 X    -(A  )  0 1 A
-- -A  0 1 A     B  1 0 B    -(A-B)  0 1 A-B
--  0  0 0 A    -B  0 1 B    -(  B)  0 1 B
--  0  0 0 A     0  0 0 X     0      0 0 X
--  0  0 0 A     B  1 0 B     (  B)  1 0 B
--  A  1 0 A    -B  0 1 B     (A-B)  1 0 A-B
--  A  1 0 A     0  0 0 X     (A  )  1 0 A
--  A  1 0 A     B  1 0 B     (A+B)  1 0 A+B
--
-- From this, it is clear that "(S+T).N" and "(S+T).P" are
-- simple (4-LUT) functions of "S.N", "S.P", "T.N" and "T.P".
-- It's also clear that "(S+T).V" is computed by one of the
-- four functions {A+B, A-B, A, B}. It turns out that these
-- four arithmetic functions exactly fit in a single slice.
--
-- It's all very well that I keep the internal partial sums
-- in this odd notation, but it won't do to deliver that to
-- the customer. So how do I compute the final result?
--
-- First of all, the final result can't have the "N" bit set,
-- because this is an unsigned multiply. If it has the "P" bit
-- set, then the correct product shows up on "(S+T).V" and I'm
-- done. The only hairy case is when the "P" bit is low. In
-- that case, the final result is supposed to be zero, but
-- "(S+T).V" will be an "X". For this reason, I have to connect
-- up the final "P" result to the synchronous reset pin of the
-- final register in such a way that when "P" is zero, the
-- final result register is held to zero.
--
-- This comment has gone on long enough that I'm including it
-- as a separate note rather than with the VHDL. I'll add VHDL
-- code for sample multipliers as replies to this note, but
-- these aren't easy to build, so give me some time.

Carl

--
Posted from firewall.terabeam.com [216.137.15.2]
via Mailgate.ORG Server - http://www.Mailgate.ORG

Article: 37730
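The "base 8-4" conversion in Carl Brannen's note above (add the octal constant o44...4, then read each octal digit as digit minus 4) can be checked numerically. The following is only a behavioral sketch in Python, not the VHDL the post promises. The function names are invented here, and the digit count is simply chosen large enough that adding the constant does not overflow, which is an assumption of this sketch rather than something the post spells out.

```python
def to_base_8_4(a, ndigits):
    """Return digits d[0..ndigits-1], each in -4..3, with sum(d[i] * 8**i) == a.

    Follows the recipe in the post: add octal 44...4 to a, then subtract 4
    from every octal digit of the sum.
    """
    offset = sum(4 * 8**i for i in range(ndigits))  # octal constant 44...4
    b = a + offset
    if not 0 <= b < 8**ndigits:
        # Assumption of this sketch: the caller picks enough digits.
        raise ValueError("value does not fit in the requested number of digits")
    return [((b >> (3 * i)) & 7) - 4 for i in range(ndigits)]


def multiply_via_partial_products(a, m, ndigits):
    """Sum one signed partial product (digit * m, weighted by 8**i) per digit."""
    digits = to_base_8_4(a, ndigits)
    return sum(d * m * 8**i for i, d in enumerate(digits))


if __name__ == "__main__":
    # 5 + o44 = 41 = o51, so the digits are 1-4 = -3 and 5-4 = 1: -3 + 1*8 = 5.
    print(to_base_8_4(5, 2))
    print(multiply_via_partial_products(5, 5, 2))
```

Each digit selects one of the multiples -4M to 3M, which matches the eight values per slice that the note asks for. How the hardware adds those signed partial products is the job of the P/N table in the note and is not modeled here.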
Carl Brannen wrote: > > Design reports: Leo gives 55 LCs on altera 20k -- 193 MHz !!

-- Mike Treseler

Article: 37731
This almost sounds like a homework question. :P

Err, wouldn't you make more progress by coming up with some code and calculating worst-case cycle counts?

Cheers,
Rupert

AAP3 <aams@dr.com> wrote in message news:wRXT7.584$B47.961868@typhoon.columbus.rr.com...
> Hi..to all
> I wrote some functions for a CDMA receiver and I want to find the number of
> MIPS required by each function. How do I calculate it?
> and which is the more accurate measure, MIPS or MOPS?
> More info:
> data rate 2Mbps.
> system clock 50MHz.
> 4 times oversampling.
> 16 Spreading factor.
>
> Thanks.

Article: 37732
> > Hi Simon,
> >
> > That's what you get for using Design Mangler...er...Manager ;-)
> >
> > Austin
>
> Hi Austin,
> Is Max Pus II any better?
> Simon

Hi Simon,

Have you been giving flying lessons to any "suspicious" characters recently?

I haven't used MaxPlus II in probably 6 or more years, so I couldn't tell you...

Regards and Happy Holidays,

Austin

Article: 37733
> > > ALWAYS
> > >
> > >>want my designs to use IOB flip flops if possible. It seems to me that
> >
> > > That's what you get for using Design Mangler...er...Manager ;-)
> >
> > heh. I find that make does a fair job of managing builds. But then, I
> > always did find CLIs more user friendly than GUIs.
> >
> > Even if you invoke map from the commandline or means other than through
> > DM, packing flops into I/Os is not done unless the -pr flag is supplied.
> > So I suppose DM is following the defaults of map.
> >
> > M. Ramirez's question still holds good -- is there ever a reason not to
> > pack flops into IOBs?
> >
>
> I think that packing registers is not the default map option because the
> expectation is that registers will have IOB=TRUE|FALSE attributes applied to
> them by the front end tool. This attribute takes precedence over the -pr map
> switch and allows for individual control of registers.
>
> Regards,
> Bret Wade
> Xilinx Product Applications

Bret,

I don't know that that is true. Even if Synplicity has that checked, the Xilinx tools STILL need the "-pr b" to be added to the mapper from what I remember.

Regards,

Austin

Article: 37734
Hi Austin,

> If I see hdl code, at least I can see where it is going, even if it is written
> badly.

For me, that is not true. I have to wade through pages and pages of text....where with a schematic, I can pick up what's going on almost instantly. Schematics offer, if done right that is, a built-in block diagram...which can not be done with text files very easily. The data flow is FAR easier to pick out in a schematic than in HDL, and control logic may or may not be easier to "understand" in HDL...it depends on how it's done.

> Nice thing about software is that people have figured out how to manage it, and
> document it.

Why is that any different than schematics?

> If I examine a design, from top to bottom, I can make a determination of the
> quality of the design by examining the hdl code. It is possible, but more
> difficult to see what is going on in schematic.

I believe the exact opposite.

> As a
> technical manager, code review is one tool that should be used to make sure the
> project is on track, following the rules, and has a higher likelihood of
> success.

And you've never seen/done a schematic review? I believe schematics are FAR easier to review than text files are. Anyway, design reviews are typically NOT of the source files, but of the architecture...it's rare that one brings source files to a design review and gives a copy to everyone in the room, and people just sit around flipping through hundreds of pages of text discussing constructs...

I think you should attend my lecture in the spring on mixed design entry for FPGA design ;-)

Regards,

Austin

Article: 37735
"Kevin Brace" <kevinbraceusenetkillspam@hotmail.com.killspam> schrieb im Newsbeitrag news:3C202A1A.B51A40DD@hotmail.com.killspam... > IRDY# (pin 23 and pin 24 respectively for Spartan-II PQ208), go through > a few levels of LUTs, and reach far away IOB output FF and tri-state ^^^^^^^^^^^^^^^^^^^^^^ I think THATs a good point to start optimization. HOW? Ok, you have a decoding logic with lets say four levels of logic. IRDY# enter the FPGA, runs through the logic and reaches the input of an IOB FF. Now the propagation time through 4 levels of logic is too long, what to do? Lets say our logic has 10 input signals. One of them is IRDY. So this decoder can be repesented by a 1024 entry ROM right? If I dont have a 1024 ENTRY ROM, You could use 2 512 entrys ROMS. The output is MUXed with IRDY#. So IRDY# has only to travel through 1 level of logic (timing analyzer calls this 3 levels, since it counts clock_2_out and setup as seperate levels) Got my point? -- MfG FalkArticle: 37736
"Austin Franklin" <austin@dark87room.com> wrote in message news:u21n7gqtl9on53@corp.supernews.com... > > > Hi Simon, > > > > > > That's what you get for using Design Mangler...er...Manager ;-) > > > > > > Austin > > > > Hi Austin, > > Is Max Pus II any better? > > Simon > > Hi Simon, > > Have you been giving flying lessons to any "suspicious" characters recently? > > I haven't used MaxPlus II in probably 6 or more years, so I couldn't tell > you... > > Regards and Happy Holidays, > > Austin Austin, It was a joke, Austin, it was a joke (similar to yours by the way) You and your wire and kids also have a Merry Christmas and Happy New Year. Simon (In Beautiful 78 degree Florida)Article: 37737
"Bret Wade" <bret.wade@xilinx.com> wrote in message news:3C20D7F1.4AB91218@xilinx.com... > I think that packing registers is not the default map option because the > expectation is that registers will have IOB=TRUE|FALSE attributes applied to > them by the front end tool. This attribute takes precedence over the -pr map > switch and allows for individual control of registers. > > Regards, > Bret Wade > Xilinx Product Applications Bret, Every company where I have worked, we've never used the above mentioned attribute. Simon RamirezArticle: 37738
Murali Jayapala <mjayapal@esat.kuleuven.ac.be> writes:

>I guess the first step is to check the proper definitions of MIPS and MOPS.
>As far as I know MIPS stands for Millions of instructions per second and MOPS
>is millions of operations per second.

>Now if you have mapped the algorithm on a VLIW/parallel machine, then each
>'instruction' has more than one 'operation'. However if the platform is a
>risc machine then each instruction has just one operation. So in case of
>parallel machines it would be fair to evaluate through MOPS, while in risc
>machines MOPS and MIPS both signify the same result.

The current definition of MIPS that I know of is "Meaningless Indicator of Processor Speed", which makes sense under current architectures.

Some people still find MFLOPS (Millions of Floating point Operations Per Second) useful, though it might make some difference whether those operations are addition, multiplication, or division. Mostly this is interesting for scientific programs that are mostly floating point and in which the number of floating point operations scales nicely with the problem size. For example, multiplying two N by N matrices takes pretty much N**3 floating multiply and N**3 floating add instructions, and the loop operations can likely be done in parallel. Running the program for different N and scaling by 2*N**3 will tell you how long it will take for N's that are neither too small (startup overhead) nor too large (page thrashing).

-- glen

Article: 37739
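The 2*N**3 scaling glen describes above can be turned into a back-of-the-envelope estimate. The Python sketch below does so; the timing figure in the example (N = 500 taking 1.25 s) is a made-up illustration, not a measurement, and the function names are invented for this sketch.

```python
def matmul_flops(n):
    """Approximate floating point operations for an N x N by N x N multiply:
    N**3 multiplies plus N**3 adds."""
    return 2 * n**3


def mflops(n, seconds):
    """Achieved MFLOPS for a run of size n that took `seconds`."""
    return matmul_flops(n) / seconds / 1e6


def predict_seconds(n, measured_mflops):
    """Estimated run time at size n, assuming the same sustained MFLOPS.
    Only reasonable where startup overhead and paging are negligible."""
    return matmul_flops(n) / (measured_mflops * 1e6)


if __name__ == "__main__":
    # Hypothetical measurement: N = 500 took 1.25 s.
    rate = mflops(500, 1.25)
    print(f"sustained rate: {rate:.1f} MFLOPS")
    print(f"estimated time for N = 1000: {predict_seconds(1000, rate):.1f} s")
```

Doubling N multiplies the operation count by eight, so the estimate grows accordingly as long as the sustained rate holds.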
> ALWAYS
>
>>want my designs to use IOB flip flops if possible. It seems to me that

> That's what you get for using Design Mangler...er...Manager ;-)

heh. I find that make does a fair job of managing builds. But then, I always did find CLIs more user friendly than GUIs.

Even if you invoke map from the commandline or means other than through DM, packing flops into I/Os is not done unless the -pr flag is supplied. So I suppose DM is following the defaults of map.

M. Ramirez's question still holds good -- is there ever a reason not to pack flops into IOBs?

--
David Miller, BCMS (Hons)  | When something disturbs you, it isn't the
Endace Measurement Systems | thing that disturbs you; rather, it is
Mobile: +64-21-704-djm     | your judgement of it, and you have the
Fax:    +64-21-304-djm     | power to change that. -- Marcus Aurelius

Article: 37740
Ditto "S. Ramirez" wrote: > "Bret Wade" <bret.wade@xilinx.com> wrote in message > news:3C20D7F1.4AB91218@xilinx.com... > > I think that packing registers is not the default map option because the > > expectation is that registers will have IOB=TRUE|FALSE attributes applied > to > > them by the front end tool. This attribute takes precedence over the -pr > map > > switch and allows for individual control of registers. > > > > Regards, > > Bret Wade > > Xilinx Product Applications > > Bret, > Every company where I have worked, we've never used the above mentioned > attribute. > Simon Ramirez -- --Ray Andraka, P.E. President, the Andraka Consulting Group, Inc. 401/884-7930 Fax 401/884-7950 email ray@andraka.com http://www.andraka.com "They that give up essential liberty to obtain a little temporary safety deserve neither liberty nor safety." -Benjamin Franklin, 1759Article: 37741
Both schematics and HDL can be horrendous or stellar. I've seen examples both ways of both. In either case, proper use of hierarchy is the key to a maintainable design. Austin Franklin wrote: > Hi Austin, > > > If I see hdl code, at least I can see where it is going, even if it is > written > > badly. > > For me, that is not true. I have to wade around pages and pages of > text....where with a schematic, I can pick up what's going on almost > instantly. Schematics offer, if done right that is, a built-in block > diagram...which can not be done with text files very easily. The data flow > is FAR easier to pick out in a schematic than in HDL, and control logic may > or may not be easier to "understand" in HDL...it depends on how it's done. > > > Nice thing about software is that people have figured out how to manage > it, and > > document it. > > Why is that any different than schematics? > > > If I examine a design, from top to bottom, I can make a determination of > the > > quality of the design by examining the hdl code. It is possible, but more > > difficult to see what is going on in schematic. > > I believe the exact opposite. > > > As a > > technical manager, code review is one tool that should be used to make > sure the > > project is on track, following the rules, and has a higher likelihood of > > success. > > And you've never seen/done a schematic review? I believe schematics are FAR > easier to review than text files are. Anyway, design reviews are typically > NOT the source files, but the architecture...it's rare that one brings > source files to a design review and gives a copy to everyone in the room, > and people just sit around flipping through hundreds of pages of text > discussing constructs... > > I think you should attend my lecture in the spring on mixed design entry for > FPGA design ;-) > > Regards, > > Austin -- --Ray Andraka, P.E. President, the Andraka Consulting Group, Inc. 
401/884-7930 Fax 401/884-7950 email ray@andraka.com http://www.andraka.com "They that give up essential liberty to obtain a little temporary safety deserve neither liberty nor safety." -Benjamin Franklin, 1759Article: 37742
Stephen, Best case timing is actually pretty easy for the manufacturer. We know what the fastest corner of the silicon process is: we get parts from that corner, test them, make them cold, supply them with the highest Vcc, and see if the performance agrees with the models and the extractions. We did not tend to specify minimums (although we do now, more and more in the newer parts) because it wasn't supposed to matter. Now that it does, we do provide that information once the process is stable (the part is in manufacturing as a regular product, not ES material). Those who wish to have a small delta between min and max must choose the fastest speed grade. This is because a slower speed grade may contain a device that is from the fastest corner (as it obviously is fast enough). As for the PCI core (and any IP), we warranty its operation partly by the specifications data sheet for the part, and partly by the exhaustive testing of the IP in real silicon. I would refer to this as specification by application. Just one more reason to use a core if it is available! Austin Stephen Byrne wrote: > I originally posted this yesterday on google groups, but I'm not seeing it > on my home news server. In case it is not visible to all, I'm reposting. > > Hello All, > > My company is currently comparing 66MHz PCI core solutions from Xilinx > and Altera, as well as debating using a home-spun core. One issue > I've come upon is the PCI requirement for a MAX clock-to-out time of 6 > ns and MIN clock-to-out time of 2 ns. Both the Xilinx ISE and Altera > Quartus II tools seem very helpful in supplying MAX (worst-case) Tco > times, but I don't see any info on best-case times. Apparently the > SDF files for back-annotated timing sim have the same worst-case > numbers repeated 3 times, resulting in the same simulation regardless > of case selection. 
My question is: how is anyone (FPGA vendors > included) guaranteeing a MIN Tco of 2 ns across all conditions and > parts if the design tools don't even yield that information? > > Thank You, > > Stephen ByrneArticle: 37743
> > > > Hi Simon, > > > > > > > > That's what you get for using Design Mangler...er...Manager ;-) > > > > > > > > Austin > > > > > > Hi Austin, > > > Is Max Pus II any better? > > > Simon > > > > Hi Simon, > > > > Have you been giving flying lessons to any "suspicious" characters > recently? > > > > I haven't used MaxPlus II in probably 6 or more years, so I couldn't tell > > you... > > > > Regards and Happy Holidays, > > > > Austin > > Austin, > It was a joke, Austin, it was a joke (similar to yours by the way) Hi Simon, Can't anyone be straight with you around ;-) > You and your wire and kids also have a Merry Christmas and Happy New > Year. My wire? Hum. I'll have to ask my wife about that... > Simon (In Beautiful 78 degree Florida) May you lose one bit for every byte! Regards, AustinArticle: 37744
Hi Carl, Is it efficient from the size or speed point? Thanks, AlexeiArticle: 37745
Something sounds wrong...aren't you registering your PCI signals in the IOBs, and are you using the built-in PCI logic? Making 33MHz in an SII should be a snap. "Kevin Brace" <kevinbraceusenetkillspam@hotmail.com.killspam> wrote in message news:3C202A1A.B51A40DD@hotmail.com.killspam... > Hi, I will like to know if someone knows the strategies on how to reduce > routing (net) delays for Spartan-II. > So far, I treated synthesis tool(XST)/Map/Par as a blackbox, but because > my design (a PCI IP core) was not meeting Tsu (Tsu < 7ns), I started to > take a closer look of how LUTs are placed on the FPGA. > Using Floorplanner, I saw the LUTs being placed all over the FPGA, so I > decided to hand place the LUTs using UCF flow. > That was the most effective thing I did to reduce interconnect delay > (reduced the worst interconnect delay by about 2.7 ns (11 ns down to 8.3 > ns)), but unfortunately, I still have to reduce the interconnect delay > by 1.3 ns (worst Tsu currently at 8.3 ns). > Basically, I have two input signals, FRAME# and IRDY# that are not > meeting timings. > Here are the two of the worst violators for FRAME# and IRDY#, > respectively. > > > > ____________________________________________________________________________ ____ > > > ============================================================================ ==== > Timing constraint: COMP "frame_n" OFFSET = IN 7 nS BEFORE COMP "clk" ; > > 503 items analyzed, 61 timing errors detected. > Minimum allowable offset is 8.115ns. 
>
> --------------------------------------------------------------------------------
> Slack:                  -1.115ns (requirement - (data path - clock path - clock arrival))
>   Source:               frame_n
>   Destination:          PCI_IP_Core_Instance_ad_Port_2
>   Destination Clock:    clk_BUFGP rising at 0.000ns
>   Requirement:          7.000ns
>   Data Path Delay:      10.556ns (Levels of Logic = 6)
>   Clock Path Delay:     2.441ns (Levels of Logic = 2)
>   Timing Improvement Wizard
>   Data Path: frame_n to PCI_IP_Core_Instance_ad_Port_2
>     Delay type         Delay(ns)  Logical Resource(s)
>     ----------------------------  -------------------
>     Tiopi                 1.224   frame_n
>                                   frame_n_IBUF
>     net (fanout=45)       0.591   frame_n_IBUF
>     Tilo                  0.653   PCI_IP_Core_Instance_I_25_LUT_7
>     net (fanout=3)        0.683   N21918
>     Tbxx                  0.981   PCI_IP_Core_Instance_I_XXL_1357_1
>     net (fanout=15)       2.352   PCI_IP_Core_Instance_I_XXL_1357_1
>     Tilo                  0.653   PCI_IP_Core_Instance_I_125_LUT_17
>     net (fanout=1)        0.749   PCI_IP_Core_Instance_N3059
>     Tilo                  0.653   PCI_IP_Core_Instance_I__n0055
>     net (fanout=1)        0.809   PCI_IP_Core_Instance_N3069
>     Tioock                1.208   PCI_IP_Core_Instance_ad_Port_2
>     ----------------------------  ------------------------------
>     Total                10.556ns (5.372ns logic, 5.184ns route)
>                                   (50.9% logic, 49.1% route)
>
>   Clock Path: clk to PCI_IP_Core_Instance_ad_Port_2
>     Delay type         Delay(ns)  Logical Resource(s)
>     ----------------------------  -------------------
>     Tgpio                 1.082   clk
>                                   clk_BUFGP/IBUFG
>     net (fanout=1)        0.007   clk_BUFGP/IBUFG
>     Tgio                  0.773   clk_BUFGP/BUFG
>     net (fanout=423)      0.579   clk_BUFGP
>     ----------------------------  ------------------------------
>     Total                 2.441ns (1.855ns logic, 0.586ns route)
>                                   (76.0% logic, 24.0% route)
>
>
> --------------------------------------------------------------------------------
>
>
>
> ================================================================================
> Timing constraint: COMP "irdy_n" OFFSET = IN 7 nS BEFORE COMP "clk" ;
>
> 698 items analyzed, 74 timing errors detected.
> Minimum allowable offset is 8.290ns.
>
> --------------------------------------------------------------------------------
> Slack:                  -1.290ns (requirement - (data path - clock path - clock arrival))
>   Source:               irdy_n
>   Destination:          PCI_IP_Core_Instance_ad_Port_2
>   Destination Clock:    clk_BUFGP rising at 0.000ns
>   Requirement:          7.000ns
>   Data Path Delay:      10.731ns (Levels of Logic = 6)
>   Clock Path Delay:     2.441ns (Levels of Logic = 2)
>   Timing Improvement Wizard
>   Data Path: irdy_n to PCI_IP_Core_Instance_ad_Port_2
>     Delay type         Delay(ns)  Logical Resource(s)
>     ----------------------------  -------------------
>     Tiopi                 1.224   irdy_n
>                                   irdy_n_IBUF
>     net (fanout=138)      0.766   irdy_n_IBUF
>     Tilo                  0.653   PCI_IP_Core_Instance_I_25_LUT_7
>     net (fanout=3)        0.683   N21918
>     Tbxx                  0.981   PCI_IP_Core_Instance_I_XXL_1357_1
>     net (fanout=15)       2.352   PCI_IP_Core_Instance_I_XXL_1357_1
>     Tilo                  0.653   PCI_IP_Core_Instance_I_125_LUT_17
>     net (fanout=1)        0.749   PCI_IP_Core_Instance_N3059
>     Tilo                  0.653   PCI_IP_Core_Instance_I__n0055
>     net (fanout=1)        0.809   PCI_IP_Core_Instance_N3069
>     Tioock                1.208   PCI_IP_Core_Instance_ad_Port_2
>     ----------------------------  ------------------------------
>     Total                10.731ns (5.372ns logic, 5.359ns route)
>                                   (50.1% logic, 49.9% route)
>
>   Clock Path: clk to PCI_IP_Core_Instance_ad_Port_2
>     Delay type         Delay(ns)  Logical Resource(s)
>     ----------------------------  -------------------
>     Tgpio                 1.082   clk
>                                   clk_BUFGP/IBUFG
>     net (fanout=1)        0.007   clk_BUFGP/IBUFG
>     Tgio                  0.773   clk_BUFGP/BUFG
>     net (fanout=423)      0.579   clk_BUFGP
>     ----------------------------  ------------------------------
>     Total                 2.441ns (1.855ns logic, 0.586ns route)
>                                   (76.0% logic, 24.0% route)
>
>
> --------------------------------------------------------------------------------
>
>
> Timing summary:
> ---------------
>
> Timing errors: 135  Score: 55289
>
> Constraints cover 27511 paths, 0 nets, and 4835 connections (92.1% coverage)
>
> ________________________________________________________________________________
>
>
> Locations of various resources:
>
>
FRAME#: pin 23 > IRDY#: pin 24 > AD[2]: pin 62 > PCI_IP_Core_Instance_I_25_LUT_7: CLB_R12C1.s1 > PCI_IP_Core_Instance_I_XXL_1357_1: CLB_R12C2 > PCI_IP_Core_Instance_I_125_LUT_17: CLB_R23C9.s0 > PCI_IP_Core_Instance_I__n0055: CLB_R24C9.s0 > > > > Input signals other than FRAME# and IRDY# are all meeting Tsu < 7 ns > requirement, and because I now figured out how to use IOB FFs, I can > easily meet Tval < 11 ns (Tco) for all output signals. > I am using Xilinx ISE WebPack 4.1 (which doesn't come with FPGA Editor), > and the PCI IP core is written in Verilog. > The device I am targeting is Xilinx Spartan-II 150K system gate speed > grade -5 part (XC2S150-5CPQ208), and I did meet all 33MHz PCI timings > with Spartan-II 150K system gate speed grade -6 part (XC2S150-6CPQ208) > when I resynthesized the PCI IP core for speed grade -6 part, and > basically reused the same UCF file with the floorplan (I had to make > small modifications to the UCF file because some of the LUT names > changed). > The reason I really care about Xilinx Spartan-II 150K system gate speed > grade -5 part is because that is the chip that is on the PCI prototype > board of Insight Electronics Spartan-II Development Kit. > Yes, I wish the PCI prototype board came with speed grade -6 . . . > Because I want the PCI IP core to be portable across different platforms > (most notably Xilinx and Altera FPGAs), I am not really interested in > making any vendor specific modification to my Verilog RTL code, but I > won't mind using various tricks in the .UCF file (for Xilinx) or .ACF > file (I believe that is the Altera equivalent of Xilinx .UCF file). > Here are some solutions I came up with. > > > 1) Reduce the signal fanout (Currently at 35 globally, but FRAME# and > IRDY#'s fanout are 200. What number should I reduce the global fanout > to?). > > 2) Use USELOWSKEWLINES in a UCF file (already tried on some long > routings, but didn't seem to help. 
I will try to play around with this > option a little more with different signals.). > > 3) Floorplan all the LUTs and FFs on the FPGA (currently, I only > floorplanned the LUTs that violated Tsu, and most of them take inputs > from FRAME# and IRDY#.). > > 4) Use Guide file Leverage mode in Map and Par. > > 5) Try routing my design 2000 times (That will take several days . . . I > once routed my design about 20 times. After routing my design 20 times, > Par seems to get stuck in certain Timing Score range beyond 20 > iterations.). > > 6) Pay for ISE Foundation 4.1 (I don't want to pay for tools because I > am poor), and use FPGA Editor (I wish ISE WebPack came with FPGA > Editor.). At least from FPGA Editor, I can see how the signals are > actually getting routed. > > 7) Use a different synthesis tool other than XST (I am poor, so I doubt > that I can afford.). > > > I will like to hear from anyone who can comment on the solutions I just > wrote, or has other suggestions on what I can do to reduce the delays to > meet 33MHz PCI's Tsu < 7 ns requirement. > > > > > Thanks, > > > > Kevin Brace (don't respond to me directly, respond within the newsgroup) > > > > > P.S. Considering that I am struggling to meet 33MHz PCI timings with > Spartan-II speed grade -5, how come Xilinx meet 66MHz PCI timings on > Virtex/Spartan-II speed grade -6? (I can only barely meet 33MHz PCI > timings with Spartan-II speed grade -6 using floorplanner.) > Is it possible to move a signal through a input pin like FRAME# and > IRDY# (pin 23 and pin 24 respectively for Spartan-II PQ208), go through > a few levels of LUTs, and reach far away IOB output FF and tri-state > control FF like pin 67 (AD[0]) or pin 203 (AD[31]) in 5 ns? (3 ns + 1.9 > to 2 ns natural clock skew = 4.9 ns to 5.0 ns realistic Tsu) > Can a signal move that fast on Virtex/Spartan-II speed grade -6? (I sort > of doubt from my experience.) 
> I know that Xilinx uses the special IRDY and TRDY pin in LogiCORE PCI, > but that won't seem to help FRAME#, since FRAME# has to be sampled > unregistered to determine an end of burst transfer. > What kind of tricks is Xilinx using in their LogiCORE PCI other than the > special IRDY and TRDY pin? > Does anyone know?Article: 37746
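For anyone following the timing report quoted in Kevin Brace's post above, the slack figures can be reproduced by hand from the formula the report itself prints: requirement - (data path - clock path - clock arrival). The sketch below is only that arithmetic, with the delay values copied from the quoted report; the constant and function names are invented for this note and are not part of any Xilinx tool.

```python
# Delays in ns, copied from the two data paths in the quoted timing report.
LOGIC_NS = [1.224, 0.653, 0.981, 0.653, 0.653, 1.208]  # Tiopi, Tilo, Tbxx, Tilo, Tilo, Tioock
SHARED_NETS_NS = [0.683, 2.352, 0.749, 0.809]           # nets after the first LUT (same in both paths)
FIRST_NET_NS = {"frame_n": 0.591, "irdy_n": 0.766}      # net from the input buffer to the first LUT
CLOCK_PATH_NS = 2.441
REQUIREMENT_NS = 7.0                                    # OFFSET = IN 7 nS BEFORE "clk"


def data_path_ns(signal):
    """Return (total data path delay, routing part of it) for one input."""
    route = FIRST_NET_NS[signal] + sum(SHARED_NETS_NS)
    return sum(LOGIC_NS) + route, route


def offset_in_slack(requirement, data_path, clock_path, clock_arrival=0.0):
    """Slack as defined in the report header."""
    return requirement - (data_path - clock_path - clock_arrival)


def summarize(signal):
    """Collect the figures the report shows for one failing path."""
    total, route = data_path_ns(signal)
    return {
        "data_path": total,
        "route": route,
        "route_share": route / total,
        "slack": offset_in_slack(REQUIREMENT_NS, total, CLOCK_PATH_NS),
    }


if __name__ == "__main__":
    for name in FIRST_NET_NS:
        s = summarize(name)
        print(f"{name}: data path {s['data_path']:.3f} ns, "
              f"route {100 * s['route_share']:.1f}%, slack {s['slack']:.3f} ns")
```

The irdy_n path comes out 1.290 ns short, which is the roughly 1.3 ns Kevin says he still has to remove, and about half of each data path is routing, with the single fanout=15 net (2.352 ns) being the largest contributor.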
> Another bad news for a conversion service is that Clear Logic recently > lost a key ruling against Altera. > > > http://www.altera.com/corporate/press_box/releases/corporate/pr-wins_clear_logic.html > > > I sort of find the ruling troubling because assuming that an Altera-made > IP is not included in the customer's design, should anyone have any > control of the bit stream file you generated from Altera's software? > I suppose that what Altera wants to say is that because the customer had > to agree prior to using an Altera software (like MAX+PLUS II or > Quartus), the customer has to use the generated bit stream file in a way > agreed in the software licensing agreement. Not only is it bad for conversion services, it is bad for all engineering. I am STUNNED that a court could find in Altera's favor! It's entirely absurd (and arrogant) for the court, and Altera, to claim you can't do what you want with the bitstream, license agreement or not! That's like getting a license agreement with a hammer that says you can only use it with a particular brand of nails! Clear Logic must have had some bad lawyers. These kinds of foolish, ignorant (IMO) court rulings just make the hair on the back of my neck stand up! It's YOUR design, and you should be able to do whatever you want with it. Just because you used their tools in the design process should not limit your use of YOUR design. OK, I'm done ranting for a few minutes. Austin Article: 37747
> I apply a set of multi-cycle constraints to a module and it works
> fine, both timing analyzer and timing simulation. Then I incorporate
> this module and a larger design and apply the the same constraints
> again. This time timing analyzer reports is OK but the timing sim is
> wrong. Any idea to solve the problem?

How is the timing simulation wrong? Since you don't say, I will make a guess: is your timed simulation model being generated without the "-xon false" flag to ngd2vhdl? Without that flag, ngd2vhdl propagates X's through flops whose inputs don't meet timing requirements. ngd2vhdl doesn't know that the path has a multicycle constraint on it, and may be flagging problems that aren't real.

--
David Miller, BCMS (Hons)  | When something disturbs you, it isn't the
Endace Measurement Systems | thing that disturbs you; rather, it is
Mobile: +64-21-704-djm     | your judgement of it, and you have the
Fax: +64-21-304-djm        | power to change that. -- Marcus Aurelius

Article: 37749
"Bryan" <bryan@srccomp.com> wrote in message news:3c20c0ef$0$25796$4c41069e@reader1.ash.ops.us.uu.net...
> Ok, this is great. Two other times I posted to this group about making hard
> macros and got no response. The low road paid off it seems. I really want
> to get these hard macros with locked routing to work. What I see when I try
> to make a hard macro is that I cannot manually route the nets in editor. I
> want to hand route the clock signal on local routing and hand route an
> enable. The first thing that happens is I start routing the clock through a
> switchbox from a normal IBUF, works fine. Then I try to re-select the input
> sight to the switchbox to make a split in the routing. I cannot select the
> input bubble again and if I just select the net and a new destination out of
> the switchbox it complains. I have a routed design open in editor right
> next to the one I am hand routing. Since I couldn't get this to work, I
> decided to first see if I could route the line by hand like the router did.
> I know I am selecting usable sights because I am cheating of a routed
> design. So how do I re-select that bubble to continue hand routing?

Bryan, it's been a while since I've done this (about 18 months), but I ended up doing it for pretty much the same reasons you did. I ran out of clock inputs for some logic that had to take clocks from outside, and I wanted precisely repeatable timing for a bunch of asynchronous busses that came on to the chip. So I built the clock domain transfer circuitry into a hard macro and placed it near each input bus.

Here are some notes:

(1) As soon as you say "hard macro", you're going to exceed Xilinx's ability to support you. You're on your own.

(2) The tools tend to blow up if you make too large a hard macro, or if you put too much routing on one. I only route the clocks.

(3) I don't try to get hard macros to specify clocks to IOBs. Instead, I choose to bring the external clocks into pins that are within about 5 CLBs of where I bring the data pins in. It just turns out that the router handles that particular clock combination with some grace.

(4) The FPGA editor has gotten more and more difficult to use as time has gone on. XACT was far better.

And yes, routing in hard macros is a pain in the butt. I'll write some notes as I walk through a hard macro routing problem:

(a) From the "File" menu I open the "Main Properties" menu.

(b) I turn off "Stub_Trimming", "AutomaticRouting", "EnhancedManual Routing", and "DelayBased Routing", then press "Apply" and "Close".

(c) I turn off the "RatsNest" button because all the uncompleted data paths will drive me nuts.

(d) On the "List1" window, I get rid of "All Components" and replace it with "All Nets".

(e) I select the clock I want to route and press the "hilite" button.

(f) When I want to see where I can route to and that kind of stuff, I turn off the "Routes" button in the tool bar. That makes all the routes invisible (except the stuff I hilited), because the problem with the tool is that you can't select a segment of an even partially routed net without selecting the whole net.

(g) When I see the next segment that I want to route the net to, I select that segment and hilite it too. (By the way, you can add buttons that allow you to hilite in different colors. This is very useful, and is found in the Xilinx documentation for FPGA Editor, but I'm assuming you haven't specialized your FPGA Editor to do this stuff.)

(h) I can now route a single segment at a time by selecting the "from" segment with the left mouse button, holding down the "shift" key, selecting the "to" segment with the left mouse button, releasing the "shift" key, and then pressing the "route" button on the right hand panel.

This actually works, believe it or not. The secret is that in order to properly manipulate routes, you have to turn off the visibility for routed lines. Where this gets stupid is when you try to route a bunch of stuff for something that is critical for route usage. You basically have to turn on visibility of routed stuff, hilite your intended route, then turn off route visibility in order to make it happen.

Enjoy.

Carl

--
Posted from firewall.terabeam.com [216.137.15.2] via Mailgate.ORG Server - http://www.Mailgate.ORG