Peter Alfke wrote:

> The advantage of built-in PowerPC microprocessor is that it connects very well to the logic fabric (the CLBs, BlockRAMs, etc.). In Virtex-IIPro, each PPC has about 700 connections to the fabric, with several 64-wide busses.
> Obviously, you could use an external PPC, but that would not only mean an additional package, it would also mean many hundreds of FPGA pins being wasted on interfacing to the external PPC. More space, more power, less reliability, and most likely lower system performance.
> The tight and flexible connection between PPC and the logic is the biggest advantage.

It looks as if the '405 consumes the space of 512 LUTs, ignoring any dedicated layers on the chip. 512 LUTs is midway between an XC2S30 and an XC2S50. But maybe the design was harder than a typical XC2S50 implementation :-)

Article: 40801
85 ps per inch works for free space. A better approximation would be 170 ps per inch for internal traces, where the relative permittivity of about 4 for FR4 material at high frequency gives a good approximation (sqrt(4) for scaling to free space). The outside traces are a bit faster because they're propagating through a combination of air and FR4 - I don't have the numbers handy for those speeds.

You're right about the stackup being important - the trace widths and plane spacings need to be well specified by you to get the board house to provide impedances that won't over/undershoot. A little mistermination is fine - the mid-transition reversals are what kill; those can occur when the driver sees a low impedance for much of the risetime but gets the reflection coming back before the clock's out of the transition region. There's the beauty of SI - this general info gets applied.

With everything close and distributed capacitance throughout, you could get smooth transitions with the star configuration, but it's dependent on the drivers and input capacitance.

Are independent clocks from the FPGA something you want to avoid? "Zero delay buffers" are part of the clock management's best application. It's often better from a debug standpoint to have access to individual terminations if things go desperately wrong.

rickman wrote:

> Austin, thanks for the simulation.
>
> This looks like great data. But I am not sure if you were trying to help by doing my simulation for me, or if you were just trying to show what the tool can do.
>
> I am not clear about what this is simulating. Obviously you used the daisy chain model, but how do you know what to use as a trace impedance and where did the delays come from? The preliminary layout I am using has the following delays in the daisy chain case, assuming 100 ps per inch. Is that a valid assumption?
>
> DSP to FPGA       100 ps
> FPGA to SDRAM1     50 ps
> SDRAM1 to SDRAM2   50 ps
> SDRAM2 to SBSRAM  100 ps
>
> Don't I need to calculate the trace impedance from the PCB design rules? The PCB will be 5 mil trace and 5 mil space with 6 or possibly 8 layers with a total thickness of 0.062". Of course, I can use wider traces for the clock and control which layer they are on.
>
> I would expect these four loads to behave much better than the five loads with 200+ ps delays.
>
> If you were just trying to demonstrate the tool, that's fine. But if you were trying to simulate my case, these are the data that should be used.
>
> When I am done my other work today, I will try downloading the software and giving it a try this weekend or next week.
>
> Austin Lesea wrote:
> >
> > Rick,
> >
> > [Image]
> >
> > Parallel termination (shown above) is great for daisy chained clocks. Of course, you have to deal with the timing, and the delays (or skews).
> >
> > Another great thing that is easy to do in HyperLynx using IBIS models.
> >
> > Austin
> >
> > PS:
> >
> > Here is no termination ....
> > [Image]
> >
> > Note some devices don't get any clocks at all .....
> >
> > rickman wrote:
> >
> > > I need to plan a high speed bus that will connect 5 devices. They will all be very closely spaced so that the lengths of the routes can be kept pretty short. The clock line is the one I am most concerned about. It is 100 MHz ECLKOUT from a TI C6711 DSP. The five devices are an SBSRAM, two SDRAMs (16 bits each for 32 bit memory) and an XC2S200E.
> > >
> > > The longest as-the-crow-flies run is 1.4" with 1 inch x and 1 inch y if you keep it square (as layout guys like to do). The other signals are within the box these two points inscribe.
> > >
> > > Another approach would be to daisy chain them which would make the total run about 3 inches. What type of termination could I expect to work well with this type of run?
> > >
> > > With such short runs, I was thinking about using no termination with a star topology. I am not even sure I need to worry about keeping the net delays equal since the variation will be less than +- 1 inch or about 100 ps of clock skew.
> > >
> > > Anyone have much experience with running high speed clocks on such short runs? Can I expect this to work well?
> > >
> > > I know Austin will tell me to simulate it, which I plan to do. I am just trying to get a "gut" feeling as Bob Pease would want to do. You know how easy it is to get the WRONG, right answer from a computer. GIGO.

Article: 40802
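As a quick check of the figures in John_H's post above, the per-inch delay follows from the speed of light scaled by the square root of the relative permittivity. The sketch below is only an illustration: the function name is made up for this example, the segment lengths are inferred from rickman's table (his delays divided by his assumed 100 ps per inch), and an effective permittivity of exactly 4 is the round number used in the post, not a measured value.

```python
import math

# Speed of light in vacuum, expressed in inches per nanosecond.
C_INCHES_PER_NS = 11.803


def delay_ps_per_inch(er_eff: float) -> float:
    """Propagation delay in ps/inch for an effective relative permittivity."""
    if er_eff < 1.0:
        raise ValueError("relative permittivity must be >= 1")
    return 1000.0 * math.sqrt(er_eff) / C_INCHES_PER_NS


# Hypothetical segment lengths in inches, inferred from rickman's table.
SEGMENTS_IN = {
    "DSP to FPGA": 1.0,
    "FPGA to SDRAM1": 0.5,
    "SDRAM1 to SDRAM2": 0.5,
    "SDRAM2 to SBSRAM": 1.0,
}

if __name__ == "__main__":
    tpd = delay_ps_per_inch(4.0)  # inner-layer FR4, er assumed to be about 4
    for name, length in SEGMENTS_IN.items():
        print(f"{name:18s} {length * tpd:6.1f} ps")
```

With these assumptions the free-space case comes out near 85 ps per inch and the er = 4 case near 170 ps per inch, so the daisy-chain delays would be roughly 1.7 times the values in the 100 ps per inch table.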
rickman <spamgoeshere4@yahoo.com> writes:

> Magnus Homann wrote:
> >
> > rickman <spamgoeshere4@yahoo.com> writes:
> >
> > > I need to plan a high speed bus that will connect 5 devices. They will all be very closely spaced so that the lengths of the routes can be kept pretty short. The clock line is the one I am most concerned about. It is 100 MHz ECLKOUT from a TI C6711 DSP. The five devices are an SBSRAM, two SDRAMs (16 bits each for 32 bit memory) and an XC2S200E.
> >
> > Is this differential? In that case I would go for daisychaining and termination at the end. SHORT stubs at intermediate devices.
> >
> > Homann
>
> No this is not differential. This is LVTTL.

Ah, in that case I would ask my boss to be put on another project...

Or use a zero-delay clock buffer (PLL/DLL), if possible.

> BTW, what do you mean by SHORT? Is that anything like telling someone to pay CAREFULL attention to signal routing? :)

EXACTLY :-)

Homann
--
Magnus Homann, M.Sc. CS & E
d0asta@dtek.chalmers.se

Article: 40803
Dan wrote:
>
> Hello,
>
> I have designed my own PCI logic for a target board (33/32). It works in the majority of wintel PCs but crashes in a significant number of PCs.

When I first fired up my own synthesizable, vendor-independent Verilog RTL PCI IP core in the Insight Electronics Spartan-II PCI development board, I got a setup time (Tsu) of about 11ns and a clock-to-output valid time (Tval) of about 15.5ns. I tested the card with an Intel 430TX chipset-based motherboard and a SiS 5598 chipset-based motherboard, and at least the configuration register access part worked okay. The I/O read/write part didn't work at all (it crashed the computer), but eventually I figured out the problem through RTL and Post P&R simulation.

Since then, I have improved my logic design skills significantly, so meeting 33MHz PCI's Tsu < 7ns is very easy after some manual floorplanning. I didn't use any special EDA tools that cost thousands of dollars to do that. I can only afford ISE WebPACK 4.1, so I synthesized my design with XST (Xilinx Synthesis Technology) and simulated it (Post P&R simulation) with ModelSim XE-Starter.

> I have implemented a design in a Xilinx PCI proto board made by Insight. This way I can assume that the PCB fabrication is sound.

I don't have any PCB design experience, but looking at the component quality of the board, I can at least say that it is far better than most PCI cards I have seen sold at computer stores/dealers.

> I feel the problem comes down to the way TRDY# and DEVSEL# are being driven. This is the logic that must be improved.
>
> The crashing occurs with reads. With my existing logic one motherboard may crash while another is ok. On a motherboard that is ok, the addition of a certain 3rd party PCI card will then result in crashes. The logic I have must be on the verge of being PCI compliant. I expect that one little tweak should be enough to clear up all my problems.

What is the Tsu of your PCI interface design? Yes, mine sort of worked okay at 11ns, but if yours is worse than that, things might start to go wrong. Plus, I heard that more loading on the bus will slow the signals down, so that might be another reason the PCI card failed.

Also, have you used an oscilloscope or a logic analyzer to look at the waveform? In my case, I couldn't afford either one of them, so I relied heavily on ModelSim XE-Starter's Post P&R simulation feature, where I saw XST messing up the synthesis. After I turned off some optimization options, the Post P&R simulation went okay, so I put the board into a computer, and it worked absolutely fine. Yes, the Tsu was still 11ns . . . The PCI bus had a PCI graphics card in addition to the Spartan-II PCI card.

> In tracking down the problem I have removed more and more logic to simplify the design and to narrow in on the cause. Almost all the logic is now gone. All that remains is:
>
> OUTPUTS:
> TRDY#
> DEVSEL#
>
> INPUTS:
> FRAME#
> IRDY#
> AD[31:00]
> C/BE[3:0]#
> CLK
>
> In this stripped down implementation there are no bursts, no parity, no master logic, no configuration space (which is not needed to effect reads/writes if you know of a conflict free address, which I do for test purposes)

Implementing the Configuration Address Space or Configuration Registers is a requirement of the PCI specification. Although a lot of PCI devices don't bother to detect parity errors (address or data), parity generation is a must in a read cycle. If you don't do that, in some systems that check for parity errors, the host PCI bridge might assert SERR#, which might cause an NMI (Non-Maskable Interrupt) that shuts the computer down. I heard some chipsets don't let PCI devices that don't implement configuration registers function at all, cutting off the clock supply to them. What chipsets do your motherboards have?

> The logic is very simple. When a read or write is decoded I take DEVSEL# active low, followed by TRDY#. When IRDY# is seen low I release TRDY# and DEVSEL# on the next clock. (wintels do not do burst reads so I know FRAME# will indicate a single data phase cycle)

I think you are making poor assumptions here. You should never assume that a burst read cycle won't occur. Even if burst reads don't occur, burst write cycles are known to occur with x86 host PCI bridges, and to safeguard against that, you must assert STOP# for each transaction. You can assert STOP# blindly, simultaneously with TRDY#, but you will likely have to add another state machine state, called backoff, to wait until FRAME# is deasserted. In the backoff state, TRDY# has to be deasserted while STOP# stays asserted until FRAME# is deasserted. See the state machine example in Appendix B of the PCI specification for what I am talking about.

> This complete test is a trivial and small piece of logic. (For anyone designing their own PCI logic this is an excellent first step to try. Once this works then you would go on to add other features.) Note that I am not even driving AD[31:00] on the reads. So the only PCI signals that I drive in response to a decoded PCI read in my address space is DEVSEL# and TRDY#.
>
> When a crash occurs it happens on a read but not every read so as you can see this is erratic.

It sounds like you didn't simulate the PCI interface before firing it up. Mine didn't work perfectly the first time because I didn't feel like simulating it (I was so anxious to fire it up). Nowadays, I will never burn a Configuration PROM without doing Post P&R simulation and making sure the synthesis tool synthesized the design correctly.

> Note: all PCI input and output signals are clocked.
>
> My schematics will be provided to anyone who requests them.

What schematics software did you use? Have you considered using HDL? Even when using synthesizable Verilog RTL code, I got the levels of LUTs low enough to meet 33MHz PCI's Tsu after some floorplanning, so it is possible to implement a PCI IP core in HDL.

> Is there anyone out there who has gone down this road designing their own PCI logic for a FPGA? Come on over. I have a plane ticket for you. Name your price.
>
> Sincerely
> Daniel DeConinck
> www.PixelSmart.com
> TEL: 416-248-4473

Really? Are you really going to pay someone to look at your design? Isn't it faster to just pay $2,000 for a Xilinx LogiCORE PCI license for Spartan-II? Well, I shouldn't really say this since mine isn't completely done, but if you are going to pay something, I won't mind letting you use my still-beta version of my PCI IP core. The "something", of course, will be far less than $2,000. If you are interested, let me know.

You may also want to take a look at opencores.org's free PCI IP core project.

http://www.opencores.org/projects/pci

My biased opinion (because it is always easier to understand your own code than someone else's) of the code I saw is that the style in which they wrote it (gate-level-like HDL code) makes it really hard to understand for anyone other than the original authors. Well, it is free, so it doesn't hurt to take a look at it, but I don't think you should expect too much. That is just my biased opinion.

Kevin Brace

Article: 40804
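Kevin Brace's post above notes that a target must generate parity on reads. As background, PCI defines PAR as even parity over AD[31:0] and C/BE[3:0]#: the number of 1s across those 36 signals plus PAR is even, and PAR is driven one clock after the data it covers. The sketch below is only a software model for checking expected values in a testbench, not HDL and not anyone's actual core; the function name is invented for this example.

```python
def pci_par(ad: int, cbe_n: int) -> int:
    """Return the PAR bit for one address or data phase.

    ad    -- value on AD[31:0]
    cbe_n -- value on C/BE[3:0]# (the raw pin levels)

    PAR is chosen so that AD, C/BE# and PAR together carry an even
    number of 1s (even parity).
    """
    if not 0 <= ad < (1 << 32):
        raise ValueError("AD must fit in 32 bits")
    if not 0 <= cbe_n < (1 << 4):
        raise ValueError("C/BE# must fit in 4 bits")
    covered = (cbe_n << 32) | ad          # the 36 bits PAR protects
    return bin(covered).count("1") & 1    # 1 if the count of 1s is odd


if __name__ == "__main__":
    # Hypothetical read data phase: data 0x00000003, all byte enables low.
    print(pci_par(0x00000003, 0x0))
```

In hardware this reduces to an XOR tree over the 36 inputs, registered so that PAR appears on the clock after the corresponding AD value.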
Rick,

Just showing an example. You need your own IBIS models for each driver and receiver, and of course your PCB trace lengths and impedances (or their widths and spacing) and the PCB stackup.

Austin

rickman wrote:

> Austin, thanks for the simulation.
>
> This looks like great data. But I am not sure if you were trying to help by doing my simulation for me, or if you were just trying to show what the tool can do.
>
> I am not clear about what this is simulating. Obviously you used the daisy chain model, but how do you know what to use as a trace impedance and where did the delays come from? The preliminary layout I am using has the following delays in the daisy chain case, assuming 100 ps per inch. Is that a valid assumption?
>
> DSP to FPGA       100 ps
> FPGA to SDRAM1     50 ps
> SDRAM1 to SDRAM2   50 ps
> SDRAM2 to SBSRAM  100 ps
>
> Don't I need to calculate the trace impedance from the PCB design rules? The PCB will be 5 mil trace and 5 mil space with 6 or possibly 8 layers with a total thickness of 0.062". Of course, I can use wider traces for the clock and control which layer they are on.
>
> I would expect these four loads to behave much better than the five loads with 200+ ps delays.
>
> If you were just trying to demonstrate the tool, that's fine. But if you were trying to simulate my case, these are the data that should be used.
>
> When I am done my other work today, I will try downloading the software and giving it a try this weekend or next week.
>
> ...snip...

Article: 40805

"Magnus Homann" <d0asta@mis.dtek.chalmers.se> wrote in message news:ltsn71cyyw.fsf@mis.dtek.chalmers.se...

> > > Is this differential? In that case I would go for daisychaining and termination at the end. SHORT stubs at intermediate devices.
> > >
> > > Homann
> >
> > No this is not differential. This is LVTTL.
>
> Ah, in that case I would ask my boss to be put on another project...

You are a big Sissy. SCNR. ;-))

A little bit more serious: is a 100 MHz LVTTL clock propagating some inches on an FR4 board that difficult to handle? I mean, sure, lots of things can go wrong if you don't know what you are doing, BUT two guys in our company designed a board with a big communication processor, with 3 fast SDRAM/SSRAM/ZBTRAM busses (100-133 MHz). They did NO simulation, "just" kept an eye on the layout and followed the basic guidelines that apply to this kind of stuff. And you won't believe it, it worked on the first run, almost perfectly, with just some minor modification of termination resistors and some clock line (length) modification.

Your comment, Austin?? ;-)

> Or use a zero-delay clock buffer (PLL/DLL), if possible.

No problem, there are at least 4 inside the FPGA; Virtex-E/-II has even more.

--
Regards
Falk

Article: 40806
Thanks for your comments John.

I guess I am a little green with clocks above 50 MHz. I was expecting runs this short to be pretty simple and not to have to do too much to make it work. But with the input from Austin and yourself as well as some others, I do at least plan to take a first pass at a simulation. My main concern is that you have to know a lot of details about the board to run a USEFUL simulation. I have learned a lot from reading some of Bob Pease's articles and I fully realize that a simulation won't do me a lick of good if I don't make all the right assumptions.

I have been working on a very tightly packed switching power supply while I am doing the digital stuff and I am finding that I can make it look pretty feasible if I make THESE assumptions and I can make it look pretty impossible if I make THOSE assumptions. I think we won't really know how well it will work until we fire it up on the final board layout. I expect that we will see the same sort of thing with this clock design.

I did consider using a zero delay buffer. But this board is very tight for space and I have a hard time justifying it with 1.5 inch traces. But if the simulation shows a problem, of course we will do what we have to.

The SDRAM and SBSRAM are 4 pF input cap max, the FPGA says 8 pF max. This is another difference between the simulation Austin did and what I have. He seems to have used all VII inputs with 10 pF capacitance.

It is also not clear to me if the simulations are being done with typical values or worst case values. If typ values are used, then I don't see how the results have any meaning at all.

Rick Collins

John_H wrote:
>
> 85 ps per inch works for free space. A better approximation would be 170 ps per inch for internal traces where the relative permittivity of about 4 for FR4 material at high frequency gives a good approximation (sqrt(4) for scaling to free space). The outside traces are a bit faster because they're propagating through a combination of air and FR4 - I don't have the numbers handy for those speeds.
>
> You're right about the stackup being important - the trace widths and plane spacings need to be well specified by you to get the board house to provide impedances that won't over/undershoot. A little mistermination is fine - the mid-transition reversals are what kills; those can occur when the driver sees a low impedance for much of the risetime but gets the reflection coming back before the clock's out of the transition region. There's the beauty of SI - this general info gets applied.
>
> With everything close and distributed capacitance throughout, you could get smooth transitions with the star configuration, but it's dependent on the drivers and input capacitance.
>
> Are independent clocks from the FPGA something you want to avoid? "Zero delay buffers" are part of the clock management's best application. It's often better from a debug standpoint to have access to individual terminations if things go desperately wrong.

...snip...

--

Rick "rickman" Collins

rick.collins@XYarius.com
Ignore the reply address. To email me use the above address with the XY removed.

Arius - A Signal Processing Solutions Company
Specializing in DSP and FPGA design      URL http://www.arius.com
4 King Ave                               301-682-7772 Voice
Frederick, MD 21701-3110                 301-682-7666 FAX

Article: 40807
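The "details about the board" that rickman worries about in the post above come down largely to trace geometry and stackup. As a rough first pass before asking the board house, a widely quoted closed-form approximation for a surface microstrip is Z0 = 87 / sqrt(er + 1.41) * ln(5.98 h / (0.8 w + t)). The sketch below applies it to a 5 mil trace. The 5 mil dielectric height, 1.4 mil (about 1 oz) copper thickness, and er = 4 are assumptions made for this example and do not come from the thread; the formula is usually quoted as reasonable only for w/h between roughly 0.1 and 2, and inner-layer (stripline) traces need a different formula.

```python
import math


def microstrip_z0(w_mil: float, h_mil: float, t_mil: float = 1.4,
                  er: float = 4.0) -> float:
    """Approximate surface-microstrip impedance in ohms.

    w_mil -- trace width in mils
    h_mil -- dielectric height to the reference plane in mils (assumed)
    t_mil -- copper thickness in mils (1.4 mil is about 1 oz copper)
    er    -- relative permittivity of the dielectric
    """
    if w_mil <= 0 or h_mil <= 0 or t_mil < 0:
        raise ValueError("dimensions must be positive")
    return 87.0 / math.sqrt(er + 1.41) * math.log(
        5.98 * h_mil / (0.8 * w_mil + t_mil))


if __name__ == "__main__":
    for width in (5.0, 8.0, 10.0):
        print(f"{width:4.1f} mil trace over 5 mil dielectric: "
              f"{microstrip_z0(width, 5.0):5.1f} ohms")
```

Widening the trace lowers the impedance, which is the knob rickman mentions when he says he can use wider traces for the clock. A field solver or the board house's own numbers should replace this estimate before layout is committed.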
Falk Brunner wrote:
>
> ??? AFAIK not all PCI inputs can be registered.

Right, not all inputs can be registered, and that's the hardest part of a PCI IP core implementation, I think.

> I don't. But Kevin has been doing so for a while. And in his threads, it came out that the timing of TRDY and IRDY is critical, and Xilinx provides special macros to implement this logic, some kind of black magic voodoo-box ;-))
> Have a look at the pinouts, they name TRDY and IRDY as special IO pins.
>
> Just my 2 EURO.cents ;-)
> --
> Regards
> Falk

About a month ago, I finally figured out a way to activate that mysterious PCILOGIC thing people were talking about, following someone's analysis on how to instantiate it. I did it from ISE WebPACK 4.1 without using FPGA Editor, since Xilinx doesn't include it. I can E-mail the sample code to anyone interested.

After all, the PCILOGIC is just logic that generates CE (Clock Enable) for the datapath (internal data source to AD[31:0] output FF), and a 5-input LUT or two cascaded 4-input LUTs can emulate that, but it will be slower because the routing delay will be less predictable. So, I attached PCILOGIC to my PCI IP core, and the thing worked fine in Post P&R simulation. I was lazy, so I didn't actually test it in a real system, though.

But this supposedly magic box won't solve the problem of keeping the levels of LUTs low for control signals (FRAME#, IRDY#, DEVSEL#, TRDY#, and STOP#), and I think that is the hardest part of a PCI IP core design because it will impact setup time.

For the Spartan-II PQ208 package, pin 24 is for IRDY# and pin 27 is for TRDY#. The opposite side of the chip has another PCILOGIC, too.

Kevin Brace

Article: 40808
Falk Brunner wrote:
>
> "Austin Lesea" <austin.lesea@xilinx.com> wrote in message news:3C92136E.B45304E1@xilinx.com...
> > Rick,
> >
> > Virtex II is also "5V compatible" when the current into the pin is less than ~ 10 to 12 mA, so the 100 ohms works for Virtex II, too.
>
> Hm. After all, I think in most cases, you don't need such a high-tech "toy" like Virtex-II when there are still some "old guys" with 5V on your board (yes, PCI is one of these exceptions). A Spartan-II will do the job (I think . . .). Or some functions and the 5V interface are put into a Spartan-II, and the rest into a Virtex-II (interfacing with the Spartan-II at 3.3V or even 2.5V, IO Banking rules ;-).
>
> Regards
> Falk

I ended up adding an XCR3256XL to my board just so I could interface to a 5 volt bus. The main FPGA is an XC2S200E which could have been an XC2S200, but I didn't want to add a fifth power domain just for one part. This board only has about 10 ICs on it and it has four power areas with three voltages. I may have a fifth when I add the battery backup :)

--

Rick "rickman" Collins

rick.collins@XYarius.com
Ignore the reply address. To email me use the above address with the XY removed.

Arius - A Signal Processing Solutions Company
Specializing in DSP and FPGA design      URL http://www.arius.com
4 King Ave                               301-682-7772 Voice
Frederick, MD 21701-3110                 301-682-7666 FAX

Article: 40809
Rick,

Use the fast/strong IBIS corner, and that covers the worst case for process/voltage/temperature.

The capacitances don't matter all that much (convince yourself by changing them, and you will see that the results don't change much at all).

The claim of IBIS simulator vendors is that it saves all of the PCB spins to fix SI, and they are right. The sims are not that fussy, and all you need to be sure of are the parts and their models, and the PCB impedance and the lengths.

Input models are not fussy either, hence my use of VII for all inputs is probably within +/- 10% of the real measured result. All CMOS inputs look pretty much the same.

If you have wide buses, then crosstalk is important, and you need a little more detail for the geometry.

Austin

rickman wrote:

> Thanks for your comments John.
>
> I guess I am a little green with clocks above 50 MHz. I was expecting runs this short to be pretty simple and not to have to do too much to make it work. But with the input from Austin and yourself as well as some others, I do at least plan to take a first pass at a simulation. My main concern is that you have to know a lot of details about the board to run a USEFUL simulation. I have learned a lot from reading some of Bob Pease's articles and I fully realize that a simulation won't do me a lick of good if I don't make all the right assumptions.
>
> I have been working on a very tightly packed switching power supply while I am doing the digital stuff and I am finding that I can make it look pretty feasible if I make THESE assumptions and I can make it look pretty impossible if I make THOSE assumptions. I think we won't really know how well it will work until we fire it up on the final board layout. I expect that we will see the same sort of thing with this clock design.
>
> I did consider using a zero delay buffer. But this board is very tight for space and I have a hard time justifying it with 1.5 inch traces. But if the simulation shows a problem, of course we will do what we have to.
>
> The SDRAM and SBSRAM are 4 pF input cap max, the FPGA says 8 pF max. This is another difference between the simulation Austin did and what I have. He seems to have used all VII inputs with 10 pF capacitance.
>
> It is also not clear to me if the simulations are being done with typical values or worst case values. If typ values are used, then I don't see how the results have any meaning at all.
>
> Rick Collins
>
> ...snip...

Article: 40810
Falk, Yes. But, if the lengths are all real short, and the rise time fairly slow, perhaps no termination at all is needed. Austin Falk Brunner wrote: > "Magnus Homann" <d0asta@mis.dtek.chalmers.se> schrieb im Newsbeitrag > news:ltg03190da.fsf@mis.dtek.chalmers.se... > > rickman <spamgoeshere4@yahoo.com> writes: > > > > > I need to plan a high speed bus that will connect 5 devices. They will > > > all be very closely spaced so that the lengths of the routes can be kept > > > pretty short. The clock line is the one I am most concerned about. It > > > is 100 MHz ECLKOUT from a TI C6711 DSP. The five devices are an SBSRAM, > > > two SDRAMs (16 bits each for 32 bit memory) and an XC2S200E. > > > > > > Is this differential? In that case I would go for daisychaining and > > termination at the end. SHORT stubs at intermediate devices. > > Isnt end termination the ONLY clean way when daisy-chaining??? (According to > the "bible" from Howard Johnson) > > Regards > FalkArticle: 40811
Falk, I agree. Austin Falk Brunner wrote: > "Austin Lesea" <austin.lesea@xilinx.com> schrieb im Newsbeitrag > news:3C92136E.B45304E1@xilinx.com... > > Rick, > > > > Virtex II is also "5V compatible" when the current into the pin is less > than ~ 10 to > > 12 mA, so the 100 ohms works for Virtex II, too. > > Hm. After all, I think in most cases, you dont need such a high-tec "toy" > like Virtex-II when there are still some "old guys" with 5V on your > board(yes, PCI is one of these exceptions). A Spartan-II will do the job (i > think . . .). Or some functions and the 5V Interface is put into a > Spartan-II, and the rest into a Virtex-II (interfacing with the Spartan-II > at 3.3V or even 2.5V, IO Banking rules ;-). > > Regards > FalkArticle: 40812
When I quoted the zero delay buffers, I was trying to point you back into the FPGA. The DLLs and DCMs can produce clocks that are very nicely phase related to a single clock, duplicating the functionality of a zero delay buffer without an external part. rickman wrote: > I did consider using a zero delay buffer. But this board is very tight > for space and I have a hard time justifying it with 1.5 inch traces. > But if the simulation shows a problem, of course we will do what we have > to. > > John_H wrote: > > > > Are independent clocks from the FPGA something you want to avoid? "Zero > > delay buffers" are part of the clock management's best application. It's > > often better from a debug standpoint to have access to individual > > terminations if things go desperately wrong.Article: 40813
Hi all,

If you push into the DCM block inside Xilinx FPGA Editor, you will find a bunch of option boxes. What does Factory_JF do? The Xilinx answer database describes it as a jitter filter function, but users are discouraged from using it. So what does it do, and what do those values mean? It would be nice if I could set it to a user-specified center frequency for whatever clock input I have, not just a "high" or "low" option.

Jon Ho
Article: 40814
What is the last set of events that takes place on the bus when a crash occurs? Do you have access to a logic analyzer? I have used a TLA714 extensively with a Newwave adapter + LA software for PCI decoding. It may save you a lot of time.

-- YWS
Article: 40815
Oh, one more thing I thought of. Does your state machine have a bus busy state? If it doesn't (in other words, if it stays in the idle state whenever a transaction is not for it), your PCI interface might mistakenly start a transaction, leading to a crash. In PCI, FRAME# = 'L' and IRDY# = 'H' signals the start of a transaction, and this can continue for multiple cycles during the first data phase, but the same condition can also occur when an initiator device inserts wait states in the middle of a burst cycle (in the middle = after the first microaccess).

Kevin Brace
Article: 40816
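A minimal sketch of the bus busy point in Kevin Brace's post above, written in Python as a behavioral model rather than as HDL. The address window, the bus traces, and the function names are all made up for illustration, and the model ignores command decoding, DEVSEL# timing, and everything else a real target needs. It only shows why a target that falls back to idle during someone else's transaction can later mistake a data value on AD for an address.

```python
from enum import Enum

class State(Enum):
    IDLE = 0
    BUS_BUSY = 1   # someone else's transaction is in progress
    CLAIMED = 2    # we decoded our own address and claimed the transaction

# Hypothetical address window decoded by this target.
BASE = 0x8000_0000
SIZE = 0x1000

def in_window(ad):
    return BASE <= ad < BASE + SIZE

def claimed_cycles(trace, use_bus_busy):
    """Return the clock numbers on which the target claims a transaction.

    trace is a list of (frame_n, irdy_n, ad) samples, one per PCI clock.
    frame_n and irdy_n are active low: 0 means asserted.
    """
    state = State.IDLE
    claims = []
    for clk, (frame_n, irdy_n, ad) in enumerate(trace):
        if state is State.IDLE:
            if frame_n == 0:
                if in_window(ad):
                    state = State.CLAIMED
                    claims.append(clk)
                elif use_bus_busy:
                    state = State.BUS_BUSY
                # without a bus busy state the target just stays IDLE
        else:
            # Leave BUS_BUSY / CLAIMED only when the bus is really idle.
            if frame_n == 1 and irdy_n == 1:
                state = State.IDLE
    return claims

# Another device's burst write; one data word happens to look like our address.
TRACE_OTHER = [
    (1, 1, 0x0000_0000),  # bus idle
    (0, 1, 0x1000_0000),  # address phase, not ours
    (0, 0, 0x1111_1111),  # data phase
    (0, 1, 0x8000_0010),  # initiator wait state: FRAME# low, IRDY# high
    (0, 0, 0x8000_0010),  # data phase
    (1, 0, 0x2222_2222),  # last data phase
    (1, 1, 0x0000_0000),  # bus idle
]

# A transaction that really is addressed to us.
TRACE_OURS = [
    (1, 1, 0x0000_0000),
    (0, 1, 0x8000_0004),  # address phase, inside our window
    (1, 0, 0x1234_5678),  # single data phase
    (1, 1, 0x0000_0000),
]

print(claimed_cycles(TRACE_OTHER, use_bus_busy=True))
print(claimed_cycles(TRACE_OTHER, use_bus_busy=False))
print(claimed_cycles(TRACE_OURS, use_bus_busy=True))
```

With the bus busy state the target never claims the other device's burst. Without it, the model claims on the wait-state clock, where FRAME# is low, IRDY# is high, and AD happens to hold a value inside the window. Detecting only the falling edge of FRAME# is another way to avoid this in practice; the sketch does not model that variant.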
Hello,

The Spartan-II IOBs have three FFs: one for data in, one for data out, and one for the tristate control signal.

My design entry is schematic based (Viewdraw). To use the data flip-flops in the IOBs I use IFD instead of just FD, and the tools know to place the FF in the IOB. But there is no symbol for the tristate control FF. I tried using the constraint IOB=TRUE on an FF, but this failed to use the IOB FF.

Help please.

Sincerely
Daniel DeConinck
Article: 40817
Thanks for the input.

Which signals cannot be registered? Do you mean TRDY and IRDY?

Xilinx told me that the dedicated IRDY & TRDY pins do have different logic than the other IO pins. But they said that they do not publish the difference and that only Xilinx can use it.

Sincerely
Daniel DeConinck
Article: 40818
I do not have a logic analyzer. I plan to implement one within the Xilinx chip. Sincerely Daniel DeConinckArticle: 40819
My personal preference is to use "FD" everywhere, and skip the special symbols for IOB use. Then, slap an IOB=TRUE attribute on the instances you want packed into the IOB. I prefer to do this attribute slapping in the UCF file, not in the schematic.

Please also note that there are IOB flip flop packing rules you must observe. Simply because you put the attribute on the instance does not mean it will happen. For instance, in a given IOB, you need to make sure that all of the flops are running off the same clock... There are other rules, too.

Eric

Dan wrote:
>
> Hello,
>
> The spartan II IOBs have three FFs. One for data in, one for data out and
> one for the tristate control signal.
>
> My design entry is schematic based ( Viewdraw)
> To use the data flip flops in the IOBs I use IFD instead of just FD and the
> tools know to place the FF in the IOB. But there is no symbol for the
> tristate control FF. I tried using the constraint IOB=TRUE on a FF but this
> failed to use the IOB FF.
>
> Help Please.
>
> Sincerely
> Daniel DeConinck
Article: 40820
Hi Kevin,

First off, thanks for lots of great input.

>Right, not all inputs can be registered

So which ones cannot be registered?

>I finally figured out a way to activate that mysterious PCILOGIC thing
>I can E-mail the sample code to anyone interested.

I accept your offer: support@pixelsmart.com

Tsu < 7ns. I assumed that this applied to PCI input signals. I use a dedicated clock input and the high speed clock routing. All input FFs are in the IOB and get latched at the same time. I thought that was all that could be done and all that should be done to input PCI signals. I get the feeling that I am missing the point on Tsu. What am I missing? Tsu: where is this measured from and to?

>I heard some chipsets don't let PCI devices that don't implement configuration registers function at all by cutting off clock supply to it.

I heard the same thing about the PCI clock being disabled.

>What are the chipsets do the motherboards you got have?

I have not even looked at my motherboard chipset.

>You should never assume that burst read cycle won't occur.

I have always heard people complaining that Wintel platforms do not burst read. Is this just a myth? Do some of them actually burst read?

I use Viewdraw. I will eventually learn VHDL. I get more embarrassed each day about using schematics. I would love to get my own PCI logic design working rather than using someone else's. Having your own is very, very flexible.

I do not have a bus busy state. But I only start a transaction if the command is a memory read/write AND my address space is decoded. Shouldn't this be enough to prevent me from starting a transaction that is not for me?

>In PCI, FRAME# = 'L' and IRDY# = 'H' signals the start of a transaction,

I think of the start of a transaction in another way. It is the first clock on which FRAME# goes low. This is when I latch the command nibble and the address. I ignore IRDY# when making this determination. Shouldn't this be sufficient to prevent me from claiming a transaction in error?

Thanks again Kevin.

Dan DeConinck
Article: 40821
I have other uses for the clock buffers, like bringing clocks into the chip. In fact, I would like to have six low skew clock inputs, but there are only four on the XC2Se I will be using. John_H wrote: > > When I quoted the zero delay buffers, I was trying to point you back into the > FPGA. The DLLs and DCMs can produce clocks that are very nicely phase related to > a single clock, duplicating the functionality of a zero delay buffer without an > external part. > > rickman wrote: > > > I did consider using a zero delay buffer. But this board is very tight > > for space and I have a hard time justifying it with 1.5 inch traces. > > But if the simulation shows a problem, of course we will do what we have > > to. > > > > John_H wrote: > > > > > > Are independent clocks from the FPGA something you want to avoid? "Zero > > > delay buffers" are part of the clock management's best application. It's > > > often better from a debug standpoint to have access to individual > > > terminations if things go desperately wrong. -- Rick "rickman" Collins rick.collins@XYarius.com Ignore the reply address. To email me use the above address with the XY removed. Arius - A Signal Processing Solutions Company Specializing in DSP and FPGA design URL http://www.arius.com 4 King Ave 301-682-7772 Voice Frederick, MD 21701-3110 301-682-7666 FAXArticle: 40822
Hi Eric,

Good point about all the FFs in an IOB needing the same clk.

Well, my IOB=TRUE constraint did not get implemented. Do you know what else might have caused this problem?

Sincerely
Daniel DeConinck
www.PixelSmart.com
TEL: 416-248-4473
Article: 40823
>Tsu < 7ns. I assumed that this applied to PCI input signals. I use a
>dedicated clock input and the high speed clock routing. All input FFs are in
>the IOB and get latched at the same time. I thought that was all that could
>be done and all that should be done to input PCI signals. I get the feeling
>that I am missing the point on Tsu. What am I missing ?
>Tsu: where is this measured from and to ?

I'm far from a PCI wizard. If you are asking that question, you need to read the PCI specs a few more times. You are probably missing something important.

Tsu is the setup time on signals from the other end before your PCI clock. A few of the control signals need to go through some logic before some signals get latched.

The normal hard example is IRDY/TRDY on a burst read from your chip - read by the other chip or write by your chip. You have to have your data in the IOB FFs in order to meet timing. You have to clock in the new data if you want to go at full speed. You have to not clock in the new data (which would trash the current data) if the other end doesn't have its IRDY/TRDY ready. That signal may not get to your chip until Tsu (7 ns) ahead of the clock, and you can't wait until the next cycle to make the decision.

Notice that TRDY and IRDY go into the magic PCI block and that the output goes where the IOBs can use it for a clock enable.

You might be able to avoid logic ahead of the FFs (clock everything in IOBs) if you are only doing single cycle transfers. I doubt it. (But again, I'm not a wizard.)

> >What are the chipsets do the motherboards you got have?
> I have not even looked at my mother board chipset.

I remember some comment from a long time ago about a popular chipset being buggy. It was easy to program around if you knew what to do. All "real" PCI implementations needed to "support" that kludge.

--
These are my opinions, not necessarily my employer's. I hate spam.
Article: 40824
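To put rough numbers on the Tsu explanation in the post above, here is a small Python sketch of the timing budget. The only figure taken from the thread is the 7 ns setup window. Every path delay below is a made-up placeholder, not a Spartan-II datasheet value, and clock skew is ignored. Real numbers have to come from your own timing report.

```python
# Setup-budget sketch for a PCI input that must pass through logic
# before it can be used at a flip-flop.

T_SU_NS = 7.0  # from the thread: the signal may arrive only 7 ns before the clock edge

def slack_ns(path_delays_ns, flop_setup_ns):
    """Time left over (negative = timing failure) for a signal that arrives
    T_SU_NS before the clock and must cross the given path first."""
    return T_SU_NS - (sum(path_delays_ns) + flop_setup_ns)

# Hypothetical delays in ns, for illustration only.
PAD_TO_IOB_FF = [1.5]                              # pad -> IOB flip-flop directly
PAD_THROUGH_FABRIC_TO_CE = [1.5, 2.5, 0.7, 2.5]    # pad -> routing -> LUT -> routing -> IOB CE
FLOP_SETUP = 1.0

registered_directly = slack_ns(PAD_TO_IOB_FF, FLOP_SETUP)
through_fabric = slack_ns(PAD_THROUGH_FABRIC_TO_CE, FLOP_SETUP)

print(f"registered in the IOB: {registered_directly:+.1f} ns slack")
print(f"through general logic: {through_fabric:+.1f} ns slack")
```

With these placeholder values the directly registered input has plenty of margin, while the path that routes IRDY/TRDY through general-purpose logic to a clock enable does not fit inside the 7 ns window. That is the situation the dedicated PCI logic mentioned above appears to be there to address.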
>I guess I am a little green with clocks above 50 MHz. I was expecting
>runs this short to be pretty simple and not to have to do too much to
>make it work. But with the input from Austin and yourself as well as
>some others, I do at least plan to take a first pass at a simulation.
>My main concern is that you have to know a lot of details about the
>board to run a USEFUL simulation. I have learned a lot from reading
>some of Bob Pease's articles and I fully realize that a simulation won't
>do me a lick of good if I don't make all the right assumptions.

Clock frequency isn't the critical parameter. You need to worry about edge rate. You would have the same troubles if you tried to run your collection of chips at half speed.

If the round trip time is less than 1/Nth of your transition time, then you can treat everything as a lumped capacitor.

Also remember that signals travel along a transmission line much more slowly if you add lumped capacitors along it.

For things like this, I highly recommend:

  High-Speed Digital Design - A Handbook of Black Magic
  by Johnson and Graham

The examples are a bit out of date now, but the methods of thinking about a problem are correct. It's a very educational book - fun to read and (generally) easy to understand. It's the sort of book that I can pick up to check something and get sucked into reading another chapter or two just because I see an interesting graph and stop to check it out (like stopping to chat when you meet an old friend).

A more modern version is:

  High-Speed Digital System Design
  A Handbook of Interconnect Theory and Design Practices
  by Hall, Hall, and McCall

I'm not as familiar with this. It looks good, but doesn't seem to be as much fun to read. Lots of good/new stuff.

--
These are my opinions, not necessarily my employer's. I hate spam.
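The round-trip rule of thumb in the post above can be tried out with a few lines of Python. This is only a sketch under stated assumptions: the 85 ps/inch free-space figure and the sqrt(relative permittivity) scaling come from John_H's post earlier in the thread, and the 1.5 inch trace length is rickman's. The divisor N is not given in the post, so n=6 here is an assumed value, and the two rise times are invented for illustration.

```python
import math

FREE_SPACE_PS_PER_INCH = 85.0  # figure quoted earlier in the thread

def delay_ps_per_inch(rel_permittivity):
    """Propagation delay of an internal trace, scaled from free space
    by sqrt(relative permittivity)."""
    return FREE_SPACE_PS_PER_INCH * math.sqrt(rel_permittivity)

def round_trip_ps(length_in, rel_permittivity=4.0):
    """Out-and-back delay of a trace of the given length in inches."""
    return 2.0 * length_in * delay_ps_per_inch(rel_permittivity)

def is_lumped(length_in, rise_time_ps, rel_permittivity=4.0, n=6):
    """True if the round trip is shorter than 1/n of the transition time.
    n=6 is an assumption; the post only says '1/Nth'."""
    return round_trip_ps(length_in, rel_permittivity) < rise_time_ps / n

# 1.5 inch internal trace on FR4 (relative permittivity about 4).
for rise_ps in (1000.0, 5000.0):  # hypothetical driver edge rates
    verdict = "lumped" if is_lumped(1.5, rise_ps) else "transmission line"
    print(f"rise {rise_ps:.0f} ps, round trip {round_trip_ps(1.5):.0f} ps: {verdict}")
```

Note that the clock frequency does not appear anywhere in the calculation, which is the point being made above: only the edge rate and the trace delay matter. Added load capacitance along the trace would increase the effective delay further, which this sketch does not model.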