Messages from 40800

Article: 40800
Subject: Re: powerpc in virtex2pro
From: "Tim" <tim@rockylogic.com.nooospam.com>
Date: Fri, 15 Mar 2002 20:16:53 -0000

Peter Alfke wrote

> The advantage of a built-in PowerPC microprocessor is that it connects
> very well to the logic fabric (the CLBs, BlockRAMs, etc.).  In
> Virtex-IIPro, each PPC has about 700 connections to the fabric, with
> several 64-wide buses.
> Obviously, you could use an external PPC, but that would not only mean an
> additional package, it would also mean many hundreds of FPGA pins being
> wasted on interfacing to the external PPC. More space, more power, less
> reliability, and most likely lower system performance.
> The tight and flexible connection between PPC and the logic is the
> biggest advantage.

It looks as if the '405 consumes the space of 512 LUTs, ignoring
any dedicated layers on the chip.  512 LUTs is midway between an
XC2S30 and an XC2S50.  But maybe the design was harder than a
typical XC2S50 implementation :-)

Article: 40801
Subject: Re: High speed clock routing
From: John_H <johnhandwork@mail.com>
Date: Fri, 15 Mar 2002 20:39:20 GMT

85 ps per inch works for free space.  A better approximation is 170 ps
per inch for internal traces: FR4 has a relative permittivity of about 4
at high frequency, and the delay scales from the free-space value by
sqrt(4) = 2.  The outside traces are a bit faster because they're
propagating through a combination of air and FR4 - I don't have the numbers
handy for those speeds.
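
As a rough sketch of that scaling (Python, added for illustration and not part of the original post; the function names are made up, and the relative permittivity of 4 is only the approximate FR4 figure quoted here, so check the laminate data for a real board):

```python
import math

# Free-space delay: one inch (0.0254 m) divided by the speed of light, in ps.
FREE_SPACE_PS_PER_INCH = 1e12 * 0.0254 / 299_792_458  # about 84.7 ps per inch


def delay_ps_per_inch(rel_permittivity):
    """Propagation delay per inch in a uniform dielectric."""
    return FREE_SPACE_PS_PER_INCH * math.sqrt(rel_permittivity)


def trace_delay_ps(length_inches, rel_permittivity=4.0):
    """One-way delay of a trace fully embedded in the dielectric."""
    return length_inches * delay_ps_per_inch(rel_permittivity)


if __name__ == "__main__":
    print(f"free space : {delay_ps_per_inch(1.0):.1f} ps/inch")
    print(f"inner FR4  : {delay_ps_per_inch(4.0):.1f} ps/inch")
    print(f"3 inch run : {trace_delay_ps(3.0):.0f} ps")
```

Outer-layer traces would need an effective permittivity somewhere between 1 and the FR4 value, which this sketch does not try to estimate.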

You're right about the stackup being important - the trace widths and plane
spacings need to be well specified by you to get the board house to provide
impedances that won't over/undershoot.  A little mistermination is fine -
the mid-transition reversals are what kill;  those can occur when the
driver sees a low impedance for much of the risetime but gets the reflection
coming back before the clock's out of the transition region.  There's the
beauty of SI - this general info gets applied.
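
A back-of-the-envelope way to see where a reflection lands relative to the edge is to compare the round-trip delay with the driver rise time. The sketch below is an illustration only, not part of the original post: the 1 ns rise time is a placeholder assumption (not a datasheet value for any part in this thread), the helper names are hypothetical, and 170 ps per inch is the inner-layer figure given above. It is no substitute for an IBIS simulation.

```python
def round_trip_ps(length_inches, ps_per_inch=170.0):
    """Time for the edge to reach the far end and the reflection to return."""
    return 2.0 * length_inches * ps_per_inch


def round_trip_fraction(length_inches, rise_time_ps, ps_per_inch=170.0):
    """Round-trip delay as a fraction of the driver rise time.

    Below 1, the reflection is back at the driver while the edge is still
    in transition; above 1, it arrives after the edge has finished.
    """
    return round_trip_ps(length_inches, ps_per_inch) / rise_time_ps


if __name__ == "__main__":
    ASSUMED_RISE_TIME_PS = 1000.0  # placeholder; take the real value from the driver model
    for length in (1.0, 1.5, 3.0):  # inches, spanning the run lengths discussed in the thread
        frac = round_trip_fraction(length, ASSUMED_RISE_TIME_PS)
        print(f"{length:.1f} in: round trip = {frac:.2f} x rise time")
```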

With everything close and distributed capacitance throughout, you could get
smooth transitions with the star configuration, but it's dependent on the
drivers and input capacitance.

Are independent clocks from the FPGA something you want to avoid?  "Zero
delay buffers" are part of the clock management's best application.  It's
often better from a debug standpoint to have access to individual
terminations if things go desperately wrong.



rickman wrote:

> Austin, thanks for the simulation.
>
> This looks like great data.  But I am not sure if you were trying to
> help by doing my simulation for me, or if you were just trying to show
> what the tool can do.
>
> I am not clear about what this is simulating.  Obviously you used the
> daisy chain model, but how do you know what to use as a trace impedance
> and where did the delays come from?  The preliminary layout I am using
> has the following delays in the daisy chain case, assuming 100pS per
> inch.  Is that a valid assumption?
>
> DSP to FPGA       100pS
> FPGA to SDRAM1     50pS
> SDRAM1 to SDRAM2   50pS
> SDRAM2 to SBSRAM  100pS
>
> Don't I need to calculate the trace impedance from the PCB design
> rules?  The PCB will be 5 mil trace and 5 mil space with 6 or possibly 8
> layers with a total thickness of 0.062".  Of course, I can use wider
> traces for the clock and control which layer they are on.
>
> I would expect these four loads to behave much better than the five
> loads with 200+ pS delays.
>
> If you were just trying to demonstrate the tool, that's fine.  But if you
> were trying to simulate my case, these are the data that should be
> used.
>
> When I am done my other work today, I will try downloading the software
> and giving it a try this weekend or next week.
>
> Austin Lesea wrote:
> >
> > Rick,
> >
> > [Image]
> >
> > Parallel termination (shown above) is great for daisy chained clocks.
> > Of course, you have to deal with the timing, and the delays (or
> > skews).
> >
> > Another great thing that is easy to do in HyperLynx using IBIS models.
> >
> > Austin
> >
> > PS:
> >
> > Here is no termination ....
> > [Image]
> >
> > Note some devices don't get any clocks at all .....
> >
> >
> > rickman wrote:
> >
> > > I need to plan a high speed bus that will connect 5 devices.  They will
> > > all be very closely spaced so that the lengths of the routes can be kept
> > > pretty short.  The clock line is the one I am most concerned about.  It
> > > is 100 MHz ECLKOUT from a TI C6711 DSP.  The five devices are an SBSRAM,
> > > two SDRAMs (16 bits each for 32 bit memory) and an XC2S200E.
> > >
> > > The longest as-the-crow-flies run is 1.4" with 1 inch x and 1 inch y if
> > > you keep it square (as layout guys like to do).  The other signals are
> > > within the box these two points inscribe.
> > >
> > > Another approach would be to daisy chain them which would make the total
> > > run about 3 inches.  What type of termination could I expect to work
> > > well with this type of run?
> > >
> > > With such short runs, I was thinking about using no termination with a
> > > star topology.  I am not even sure I need to worry about keeping the net
> > > delays equal since the variation will be less than +- 1 inch or about
> > > 100 pS of clock skew.
> > >
> > > Anyone have much experience with running high speed clocks on such short
> > > runs?  Can I expect this to work well?
> > >
> > > I know Austin will tell me to simulate it, which I plan to do.  I am
> > > just trying to get a "gut" feeling as Bob Pease would want to do.  You
> > > know how easy it is to get the WRONG, right answer from a computer.
> > > GIGO.
> > > --
> > >
> > > Rick "rickman" Collins
> > >
> > > rick.collins@XYarius.com
> > > Ignore the reply address. To email me use the above address with the XY
> > > removed.
> > >
> > > Arius - A Signal Processing Solutions Company
> > > Specializing in DSP and FPGA design      URL http://www.arius.com
> > > 4 King Ave                               301-682-7772 Voice
> > > Frederick, MD 21701-3110                 301-682-7666 FAX
>
> --
>
> Rick "rickman" Collins
>
> rick.collins@XYarius.com
> Ignore the reply address. To email me use the above address with the XY
> removed.
>
> Arius - A Signal Processing Solutions Company
> Specializing in DSP and FPGA design      URL http://www.arius.com
> 4 King Ave                               301-682-7772 Voice
> Frederick, MD 21701-3110                 301-682-7666 FAX

Article: 40802
Subject: Re: High speed clock routing
From: Magnus Homann <d0asta@mis.dtek.chalmers.se>
Date: 15 Mar 2002 22:17:43 +0100

rickman <spamgoeshere4@yahoo.com> writes:

> Magnus Homann wrote:
> > 
> > rickman <spamgoeshere4@yahoo.com> writes:
> > 
> > > I need to plan a high speed bus that will connect 5 devices.  They will
> > > all be very closely spaced so that the lengths of the routes can be kept
> > > pretty short.  The clock line is the one I am most concerned about.  It
> > > is 100 MHz ECLKOUT from a TI C6711 DSP.  The five devices are an SBSRAM,
> > > two SDRAMs (16 bits each for 32 bit memory) and an XC2S200E.
> > 
> > Is this differential? In that case I would go for daisychaining and
> > termination at the end. SHORT stubs at intermediate devices.
> > 
> > Homann
> 
> No this is not differential.  This is LVTTL.

Ah, in that case I would ask my boss to be put on another project...

Or use a zero-delay clock buffer (PLL/DLL), if possible.

> BTW, what do you mean by
> SHORT?  Is that anything like telling someone to pay CAREFULL attention
> to signal routing?  :)  

EXACTLY :-)

Homann
-- 
Magnus Homann, M.Sc. CS & E
d0asta@dtek.chalmers.se

Article: 40803
Subject: Re: PCI design in a Spartan II which crashes in some wintel PCs
From: Kevin Brace <ihatespam99kevinbraceusenet@ihatespam99hotmail.com>
Date: Fri, 15 Mar 2002 16:14:53 -0600

Dan wrote:
> 
> Hello,
> 
> I have designed my own PCI logic for a target board (33 MHz, 32-bit). It works in the
> majority of wintel PCs but crashes in a significant number of PCs.
> 

        When I first fired up my own synthesizable, vendor-independent
Verilog RTL PCI IP core on an Insight Electronics Spartan-II PCI
development board, I got a setup time (Tsu) of about 11ns and a
clock-to-output valid time (Tval) of about 15.5ns.
I tested the card with an Intel 430TX chipset-based motherboard and a SiS
5598 chipset-based motherboard, and at least the configuration register
access part worked okay.
The I/O read/write part didn't work at all (it crashed the computer), but
eventually I figured out the problem through RTL and Post P&R
simulation.
Since then, I have improved my logic design skills significantly, so
meeting 33MHz PCI's Tsu < 7ns is very easy after some manual
floorplanning.
I didn't use any special EDA tools that cost thousands of dollars to do
that.
I can only afford ISE WebPACK 4.1, so I synthesized my design with XST
(Xilinx Synthesis Technology), and simulated it (Post P&R simulation)
with ModelSim XE-Starter.
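
To put those numbers against a timing budget, here is a small illustrative check (Python, added for illustration and not part of the original post). Only the Tsu < 7ns limit is stated in this thread; the Tval, Tprop, and Tskew limits below are the bused-signal figures for 33MHz PCI as recalled from the specification and should be treated as assumptions to verify. The dictionary and function names are hypothetical.

```python
# Limits in ns for bused signals on 33 MHz PCI.  "tsu" is the figure quoted
# in the post; the remaining entries are assumptions recalled from the spec.
PCI33_NS = {"tsu": 7.0, "tval": 11.0, "tprop": 10.0, "tskew": 2.0, "tcyc": 30.0}


def budget_is_consistent(spec=PCI33_NS):
    """Tval + Tprop + Tskew + Tsu should add up to one clock period."""
    total = spec["tval"] + spec["tprop"] + spec["tskew"] + spec["tsu"]
    return total == spec["tcyc"]


def violations_ns(measured, spec=PCI33_NS):
    """Return, per parameter, how far a measured value exceeds its limit."""
    return {name: value - spec[name]
            for name, value in measured.items()
            if value > spec[name]}


if __name__ == "__main__":
    first_attempt = {"tsu": 11.0, "tval": 15.5}  # figures reported in the post
    print(budget_is_consistent())
    print(violations_ns(first_attempt))
```

Under these assumed limits, the first attempt misses both parameters by roughly 4 ns, which fits the observation that it only "sort of" worked.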


> I have implemented a design in a Xilinx PCI proto board made by Insight.
> This way I can assume that the PCB fabrication is sound.
> 

        I don't have any PCB design experience, but looking at the
component quality of the board, I can at least say that it is far better
than most PCI cards I have seen sold at computer stores/dealers.



> I feel the problem comes down to the way TRDY# and DEVSEL# are being driven.
> This is the logic that must be improved.
> 
> The crashing occurs with reads. With my existing logic one motherboard may
> crash while another is ok. On a motherboard that is ok, the addition of a
> certain 3rd party PCI card will then result in crashes. The logic I have
> must be on the verge of being PCI compliant. I expect that one little tweak
> should be enough to clear up all my problems.
> 

        What is the Tsu of your PCI interface design?
Yes, mine sort of worked okay at 11ns, but if it is worse than that,
things might start to go wrong.
Plus, I heard that more loading on the bus will make the signals slow
down, so that might be another reason the PCI card failed.
Also, have you used an oscilloscope or a logic analyzer to look at the
waveform?
In my case, I couldn't afford either one of them, so I relied heavily on
ModelSim XE-Starter's Post P&R simulation feature, where I saw XST
messing up the synthesis.
After I turned off some optimization options, the Post P&R simulation
went okay, so I put the board into a computer, and it worked absolutely
fine.
Yes, the Tsu was still 11ns . . .
Besides the Spartan-II PCI card, the PCI bus also had a PCI graphics card.



> In tracking down the problem I have removed more and more logic to simplify
> the design and to narrow in on the cause. Almost all the logic is now gone.
> All that remains is:
> 
> OUTPUTS:
> TRDY#
> DEVSEL#
> 
> INPUTS:
> FRAME#
> IRDY#
> AD[31:00]
> C/BE[3:0]#
> CLK
> 
> In this stripped down implementation there are no bursts, no parity, no
> master logic, no configuration space ( which is not needed to effect
> reads/writes if you know of a conflict free address, which I do for test
> purposes )

        Implementing the Configuration Address Space or Configuration
Registers is a requirement of the PCI specification.
Although a lot of PCI devices don't bother to detect parity errors
(address or data), parity generation is a must in a read cycle.
If you don't do that, in some systems that check for parity errors, the
host PCI bridge might assert SERR#, which might cause an NMI (Non
Maskable Interrupt) that shuts the computer down.
I heard some chipsets don't let PCI devices that don't implement
configuration registers function at all, by cutting off the clock supply
to them.
Which chipsets do your motherboards have?
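
For reference, a minimal sketch of the parity a target has to generate on a read (Python, added for illustration and not part of the original post; the function name is made up). PCI uses even parity: PAR is chosen so that the number of 1s across AD[31:0], C/BE[3:0]#, and PAR itself is even, and PAR is driven one clock after the data phase it covers.

```python
def pci_par(ad, cbe_n):
    """Return the PAR bit for one address or data phase.

    ad    : 32-bit value on AD[31:0]
    cbe_n : 4-bit value on C/BE[3:0]# as seen on the pins
    """
    bits = (ad & 0xFFFFFFFF) | ((cbe_n & 0xF) << 32)
    # PAR is 1 when the 36 covered bits hold an odd number of 1s,
    # so that the total including PAR comes out even.
    return bin(bits).count("1") & 1


if __name__ == "__main__":
    print(pci_par(0x00000000, 0x0))
    print(pci_par(0x00000001, 0x0))
    print(pci_par(0xFFFFFFFF, 0xF))
```

In hardware this is a 36-input XOR tree whose result is registered and driven onto PAR in the following clock.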



> The logic is very simple. When a read or write is decoded I take DEVSEL#
> active low, followed by TRDY#. When IRDY# is seen low I release TRDY# and
> DEVSEL# on the next clock. (wintels do not do burst reads so I know FRAME#
> will indicate a single data phase cycle)


        I think you are making poor assumptions here.
You should never assume that a burst read cycle won't occur.
Even if burst read cycles don't occur, burst write cycles are known to occur
with x86 host PCI bridges, and to safeguard against that, you must assert
STOP# for each transaction.
You can assert STOP# blindly, simultaneously with the TRDY# assertion, but
you will likely have to add another state machine state called backoff to
wait until FRAME# is deasserted.
In the backoff state, TRDY# has to be deasserted while STOP# stays
asserted until FRAME# is deasserted.
See PCI specification Appendix B's state machine example for what I am
talking about.
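
The backoff behaviour described above can be sketched as a small behavioural model (Python, added for illustration and not part of the original post). This is a deliberately simplified sketch, not a copy of the Appendix B state machine: signals are modelled as booleans that are True when asserted (low on the bus), `hit` is a hypothetical address-decode input, and decode latency, turnaround details, and data-path control are left out.

```python
from enum import Enum


class State(Enum):
    IDLE = 0
    S_DATA = 1    # DEVSEL#, TRDY# and STOP# all asserted
    BACKOFF = 2   # TRDY# released, STOP# held until FRAME# is deasserted
    TURN_AR = 3   # release the bus for one clock


def outputs(state):
    """Return (devsel, trdy, stop), each True when asserted."""
    if state is State.S_DATA:
        return (True, True, True)
    if state is State.BACKOFF:
        return (True, False, True)
    return (False, False, False)


def next_state(state, frame, irdy, hit):
    """State after the next clock edge, given the sampled inputs."""
    if state is State.IDLE:
        return State.S_DATA if (frame and hit) else State.IDLE
    if state is State.S_DATA:
        if not irdy:
            return State.S_DATA          # wait for the master
        # One data phase transferred; back off if the master wants more.
        return State.BACKOFF if frame else State.TURN_AR
    if state is State.BACKOFF:
        return State.BACKOFF if frame else State.TURN_AR
    return State.IDLE                    # TURN_AR


def run(stimulus):
    """Apply (frame, irdy, hit) tuples clock by clock; return visited states."""
    state, trace = State.IDLE, []
    for frame, irdy, hit in stimulus:
        state = next_state(state, frame, irdy, hit)
        trace.append(state)
    return trace


if __name__ == "__main__":
    burst_attempt = [(True, False, True), (True, True, False),
                     (False, True, False), (False, False, False)]
    print([s.name for s in run(burst_attempt)])
```

With a master that keeps FRAME# asserted after the first data phase, the model passes through BACKOFF before releasing the bus; a single-data-phase access skips that state.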



>  This complete test is a trivial
> and small piece of logic. ( For anyone designing their own PCI logic this is
> an excellent first step to try. Once this works then you would go on to add
> other features.)  Note that I am not even driving AD[31:00] on the reads. So
> the only PCI signals that I drive in response to a decoded PCI read in my
> address space is DEVSEL# and TRDY#.
> 
> When a crash occurs it happens on a read but not every read so as you can
> see this is erratic.

        It sounds like you didn't simulate the PCI interface before
firing it up.
Mine didn't work perfectly the first time because I didn't feel like
simulating it (I was so anxious to fire it up.).
Nowadays, I will never burn a Configuration PROM without doing Post P&R
simulation, and making sure the synthesis tool synthesized the design
correctly.



> Note: all PCI input and output signals are clocked.
> 
> My schematics will be provided to anyone who requests them.
> 

        What schematics software did you use?
Have you considered using HDL?
Even when using synthesizable Verilog RTL code, I got the number of LUT
levels low enough to meet 33MHz PCI's Tsu after some floorplanning, so
it is possible to implement a PCI IP core in HDL.



> Is there anyone out there who has gone down this road designing their own
> PCI logic for a FPGA ? Come on over. I have a plane ticket for you. Name
> your price.
> 
> Sincerely
> Daniel DeConinck
> www.PixelSmart.com
> TEL: 416-248-4473


        Really?
Are you really going to pay someone to look at your design?
Isn't it faster to just pay $2,000 for a Xilinx LogiCORE PCI license for
Spartan-II?
Well, I shouldn't really say this since mine isn't completely done, but
if you are going to pay something, I won't mind letting you use my still
beta version of PCI IP core.
The "something", of course, will be far less than $2,000.
If you are interested, let me know that.
        You may also want to take a look at opencores.org's free PCI IP
core project.

http://www.opencores.org/projects/pci


My biased opinion (because it is always easier to understand your own
code than someone else's) of the code I saw is that the style in which they
wrote it (gate-level-like HDL code) makes it really hard for anyone
other than the original authors to understand.
Well, it is free, so it doesn't hurt to take a look at it, but I
don't think you should expect too much.
That is just my biased opinion.



Kevin Brace

Article: 40804
Subject: Re: High speed clock routing
From: Austin Lesea <austin.lesea@xilinx.com>
Date: Fri, 15 Mar 2002 14:42:14 -0800

Rick,

Just showing an example.  You need your own IBIS models for each driver and
receiver, and of course, your pcb trace lengths and impedances, or their
widths and spacing, and the pcb stackup.

Austin

Article: 40805
Subject: Re: High speed clock routing
From: "Falk Brunner" <Falk.Brunner@gmx.de>
Date: Sat, 16 Mar 2002 00:14:38 +0100

"Magnus Homann" <d0asta@mis.dtek.chalmers.se> schrieb im Newsbeitrag
news:ltsn71cyyw.fsf@mis.dtek.chalmers.se...
> > > Is this differential? In that case I would go for daisychaining and
> > > termination at the end. SHORT stubs at intermediate devices.
> > >
> > > Homann
> >
> > No this is not differential.  This is LVTTL.
>
> Ah, in that case I would ask my boss to be put on another project...

You are a big Sissy.

SCNR.  ;-))

A little bit more serious: is a 100 MHz LVTTL clock propagating some inches
on an FR4 board that difficult to handle?? I mean, sure, lots of things can
go wrong if you don't know what you are doing, BUT
two guys in our company designed a board with a big communication processor,
with 3 fast SDRAM/SSRAM/ZBTRAM buses (100-133 MHz). They did NO simulation,
"just" kept an eye on the layout and followed the basic guidelines that apply
to this kind of stuff. And you won't believe it, it worked on the first run,
almost perfectly, with just some minor modification of termination resistors
and some clock line (length) modification.
Your comment, Austin?? ;-)

> Or use a zero-delay clock buffer (PLL/DLL), if possible.

No problem, there are at least 4 inside the FPGA, Virtex-E/-II has even
more.

--
MfG
Falk

Article: 40806
Subject: Re: High speed clock routing
From: rickman <spamgoeshere4@yahoo.com>
Date: Fri, 15 Mar 2002 18:22:57 -0500

Thanks for your comments John.  

I guess I am a little green with clocks above 50 MHz.  I was expecting
runs this short to be pretty simple and not to have to do too much to
make it work.  But with the input from Austin and yourself as well as
some others, I do at least plan to take a first pass at a simulation. 
My main concern is that you have to know a lot of details about the
board to run a USEFUL simulation.  I have learned a lot from reading
some of Bob Pease's articles and I fully realize that a simulation won't
do me a lick of good if I don't make all the right assumptions. 

I have been working on a very tightly packed switching power supply
while I am doing the digital stuff and I am finding that I can make it
look pretty feasible if I make THESE assumptions and I can make it look
pretty impossible if I make THOSE assumptions.  I think we won't really
know how well it will work until we fire it up on the final board
layout.  I expect that we will see the same sort of thing with this
clock design. 

I did consider using a zero delay buffer.  But this board is very tight
for space and I have a hard time justifying it with 1.5 inch traces. 
But if the simulation shows a problem, of course we will do what we have
to.  

The SDRAM and SBSRAM are 4 pF input cap max, the FPGA says 8 pF max. 
This is another difference between the simulation Austin did and what I
have.  He seems to have used all VII inputs with 10 pF capacitance.  

It is also not clear to me if the simulations are being done with
typical values or worst case values.  If typ values are used, then I
don't see how the results have any meaning at all.  

Rick Collins

-- 

Rick "rickman" Collins

rick.collins@XYarius.com
Ignore the reply address. To email me use the above address with the XY
removed.

Arius - A Signal Processing Solutions Company
Specializing in DSP and FPGA design      URL http://www.arius.com
4 King Ave                               301-682-7772 Voice
Frederick, MD 21701-3110                 301-682-7666 FAX

Article: 40807
Subject: Re: PCI design in a Spartan II which crashes in some wintel PCs
From: Kevin Brace <ihatespam99kevinbraceusenet@ihatespam99hotmail.com>
Date: Fri, 15 Mar 2002 17:26:46 -0600

Falk Brunner wrote:
> 
> 
> ??? AFAIK not all PCI inputs can be registered.
> 

        Right, not all inputs can be registered, and that's the hardest
part of a PCI IP core implementation I think.


> 
> I don't. But Kevin has been doing so for a while. And in his threads, it came
> out that the timing of TRDY and IRDY is critical, and Xilinx provides
> special macros to implement this logic, some kind of black magic voodoo-box
> ;-))
> Have a look at the pinouts; they mark TRDY and IRDY as special IO pins.
> 
> Just my 2 EURO.cents ;-)
> --
> MfG
> Falk

        About a month ago, I finally figured out a way to activate that
mysterious PCILOGIC thing people were talking about, following someone's
analysis on how to instantiate it.
I did it from ISE WebPACK 4.1 without using FPGA Editor since Xilinx
doesn't include it.
I can E-mail the sample code to anyone interested.
After all, the PCILOGIC is just logic that generates CE (Clock Enable)
for the datapath (internal data source to AD[31:0] output FF), and a
5-input LUT or two cascaded 4-input LUTs can emulate that, but it will
be slower because the routing delay will be less predictable.
So, I attached PCILOGIC to my PCI IP core, and the thing worked fine in
Post P&R simulation.
I was lazy, so I didn't actually test it in a real system though.
But this supposedly magic box won't solve the problem of keeping the number
of LUT levels low for control signals (FRAME#, IRDY#, DEVSEL#, TRDY#, and
STOP#), and I think that is the hardest part of a PCI IP core design
because it will impact setup time.
For the Spartan-II PQ208 package, pin 24 is for IRDY# and pin 27 is for
TRDY#.
The opposite side of the chip has another PCILOGIC, too.



Kevin Brace

Article: 40808
Subject: Re: Xilinix FPGA with 5V IO
From: rickman <spamgoeshere4@yahoo.com>
Date: Fri, 15 Mar 2002 18:41:40 -0500

Falk Brunner wrote:
> 
> "Austin Lesea" <austin.lesea@xilinx.com> schrieb im Newsbeitrag
> news:3C92136E.B45304E1@xilinx.com...
> > Rick,
> >
> > Virtex II is also "5V compatible" when the current into the pin is less
> > than ~ 10 to 12 mA, so the 100 ohms works for Virtex II, too.
> 
> Hm. After all, I think in most cases, you don't need such a high-tech "toy"
> like Virtex-II when there are still some "old guys" with 5V on your
> board (yes, PCI is one of these exceptions). A Spartan-II will do the job (I
> think . . .). Or some functions and the 5V interface are put into a
> Spartan-II, and the rest into a Virtex-II (interfacing with the Spartan-II
> at 3.3V or even 2.5V, IO Banking rules ;-).
> 
> Regards
> Falk

I ended up adding an XCR3256XL to my board just so I could interface to a
5 volt bus.  The main FPGA is an XC2S200E which could have been an
XC2S200, but I didn't want to add a fifth power domain just for one
part.  This board only has about 10 ICs on it and it has four power
areas with three voltages.  I may have a fifth when I add the battery
backup :)

-- 

Rick "rickman" Collins

rick.collins@XYarius.com
Ignore the reply address. To email me use the above address with the XY
removed.

Arius - A Signal Processing Solutions Company
Specializing in DSP and FPGA design      URL http://www.arius.com
4 King Ave                               301-682-7772 Voice
Frederick, MD 21701-3110                 301-682-7666 FAX

Article: 40809
Subject: Re: High speed clock routing
From: Austin Lesea <austin.lesea@xilinx.com>
Date: Fri, 15 Mar 2002 15:47:30 -0800

Rick,

Use the fast/strong IBIS corner, and that covers the worst case for
process/voltage/temperature.

The capacitances don't matter all that much (convince yourself by changing them,
and you will see that the results don't change much at all).

The claim of IBIS simulator vendors is that it saves all of the PCB spins to fix
SI, and they are right.  The sims are not that fussy, and all you need to be sure
of are the parts and their models, and the PCB impedance and the lengths.

Input models are not fussy either, hence my use of VII for all inputs is probably
+/- 10% of the real measured result.  All CMOS inputs look pretty much the same.

If you have wide buses, then crosstalk is important, and you need a little more
detail for the geometry.

Austin

rickman wrote:

> Thanks for your comments John.
>
> I guess I am a little green with clocks above 50 MHz.  I was expecting
> runs this short to be pretty simple and not to have to do too much to
> make it work.  But with the input from Austin and yourself as well as
> some others, I do at least plan to take a first pass at a simulation.
> My main concern is that you have to know a lot of details about the
> board to run a USEFUL simulation.  I have learned a lot from reading
> some of Bob Pease's articles and I fully realize that a simulation won't
> do me a lick of good if I don't make all the right assumptions.
>
> I have been working on a very tightly packed switching power supply
> while I am doing the digital stuff and I am finding that I can make it
> look pretty feasible if I make THESE assumptions and I can make it look
> pretty impossible if I make THOSE assumptions.  I think we won't really
> know how well it will work until we fire it up on the final board
> layout.  I expect that we will see the same sort of thing with this
> clock design.
>
> I did consider using a zero delay buffer.  But this board is very tight
> for space and I have a hard time justifying it with 1.5 inch traces.
> But if the simulation shows a problem, of course we will do what we have
> to.
>
> The SDRAM and SBSRAM are 4 pF input cap max, the FPGA says 8 pF max.
> This is another difference between the simulation Austin did and what I
> have.  He seems to have used all VII inputs with 10 pF capacitance.
>
> It is also not clear to me if the simulations are being done with
> typical values or worst case values.  If typ values are used, then I
> don't see how the results have any meaning at all.
>
> Rick Collins
>
> John_H wrote:
> >
> > 85 ps per inch works for free space.  A better approximation would be 170 ps
> > per inch for internal traces where the relative permittivity of about 4 for
> > FR4 material at high frequency gives a good approximation (sqrt(4) for
> > scaling to free space).  The outside traces are a bit faster because they're
> > propagating through a combination of air and FR4 - I don't have the numbers
> > handy for those speeds.
> >
> > You're right about the stackup being important - the trace widths and plane
> > spacings need to be well specified by you to get the board house to provide
> > impedances that won't over/undershoot.  A little mistermination is fine -
> > the mid-transition reversals are what kills;  those can occur when the
> > driver sees a low impedance for much of the risetime but gets the reflection
> > coming back before the clock's out of the transition region.  There's the
> > beauty of SI - this general info gets applied.
> >
> > With everything close and distributed capacitance throughout, you could get
> > smooth transitions with the star configuration, but it's dependent on the
> > drivers and input capacitance.
> >
> > Are independent clocks from the FPGA something you want to avoid?  "Zero
> > delay buffers" are part of the clock management's best application.  It's
> > often better from a debug standpoint to have access to individual
> > terminations if things go desperately wrong.
> >
> > rickman wrote:
> >
> > > Austin, thanks for the simulation.
> > >
> > > This looks like great data.  But I am not sure if you were trying to
> > > help by doing my simulation for me, or if you were just trying to show
> > > what the tool can do.
> > >
> > > I am not clear about what this is simulating.  Obviously you used the
> > > daisy chain model, but how do you know what to use as a trace impedance
> > > and where did the delays come from?  The preliminary layout I am using
> > > has the following delays in the daisy chain case, assuming 100pS per
> > > inch.  Is that a valid assumption?
> > >
> > > DSP to FPGA       100pS
> > > FPGA to SDRAM1     50pS
> > > SDRAM1 to SDRAM2   50pS
> > > SDRAM2 to SBSRAM  100pS
> > >
> > > Don't I need to calculate the trace impedance from the PCB design
> > > rules?  The PCB will be 5 mil trace and 5 mil space with 6 or possibly 8
> > > layers with a total thickness of 0.062".  Of course, I can use wider
> > > traces for the clock and control which layer they are on.
> > >
> > > I would expect these four loads to behave much better than the five
> > > loads with 200+ pS delays.
> > >
> > > If you were just trying to demonstrate the tool, that's fine.  But if you
> > > were trying to simulate my case, these are the data that should be
> > > used.
> > >
> > > When I am done with my other work today, I will try downloading the software
> > > and giving it a try this weekend or next week.
>
> ...snip...
>
> --
>
> Rick "rickman" Collins
>
> rick.collins@XYarius.com
> Ignore the reply address. To email me use the above address with the XY
> removed.
>
> Arius - A Signal Processing Solutions Company
> Specializing in DSP and FPGA design      URL http://www.arius.com
> 4 King Ave                               301-682-7772 Voice
> Frederick, MD 21701-3110                 301-682-7666 FAX
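
A note on the per-inch figures quoted above: the 85 ps and 170 ps numbers
follow from delay = sqrt(er) / c.  The sketch below is only a sanity check.
It assumes the trace is fully surrounded by dielectric (reasonable for inner
layers, not for outer ones) and that the quoted daisy-chain table was built
from lengths at 100 ps/inch.  The function and variable names are made up
for illustration.

```python
import math

C_M_PER_S = 299_792_458.0   # speed of light in vacuum
M_PER_INCH = 0.0254

def delay_ps_per_inch(rel_permittivity):
    """Propagation delay in ps/inch for a trace fully surrounded by dielectric."""
    return math.sqrt(rel_permittivity) * M_PER_INCH / C_M_PER_S * 1e12

# Segment delays from the quoted daisy-chain table, which assumed 100 ps/inch.
quoted_segments_ps = {
    "DSP to FPGA": 100.0,
    "FPGA to SDRAM1": 50.0,
    "SDRAM1 to SDRAM2": 50.0,
    "SDRAM2 to SBSRAM": 100.0,
}

def rescale_segments(segments_ps, assumed_ps_per_inch, actual_ps_per_inch):
    """Convert each delay back to a length, then apply a different ps/inch figure."""
    return {
        name: delay / assumed_ps_per_inch * actual_ps_per_inch
        for name, delay in segments_ps.items()
    }

if __name__ == "__main__":
    fr4 = delay_ps_per_inch(4.0)
    print(f"free space: {delay_ps_per_inch(1.0):.1f} ps/inch")
    print(f"er = 4:     {fr4:.1f} ps/inch")
    for name, delay in rescale_segments(quoted_segments_ps, 100.0, fr4).items():
        print(f"{name:18s} {delay:6.1f} ps")
```

With er = 4 the whole chain comes out at roughly 0.5 ns end to end rather
than the 300 ps implied by 100 ps/inch, before any slowdown from the load
capacitance is counted.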

Article: 40810
Subject: Re: High speed clock routing
From: Austin Lesea <austin.lesea@xilinx.com>
Date: Fri, 15 Mar 2002 15:48:21 -0800
Links: << >> << T >> << A >>

Falk,

Yes.

But, if the lengths are all real short, and the rise time fairly slow, perhaps
no termination at all is needed.

Austin

Falk Brunner wrote:

> "Magnus Homann" <d0asta@mis.dtek.chalmers.se> schrieb im Newsbeitrag
> news:ltg03190da.fsf@mis.dtek.chalmers.se...
> > rickman <spamgoeshere4@yahoo.com> writes:
> >
> > > I need to plan a high speed bus that will connect 5 devices.  They will
> > > all be very closely spaced so that the lengths of the routes can be kept
> > > pretty short.  The clock line is the one I am most concerned about.  It
> > > is 100 MHz ECLKOUT from a TI C6711 DSP.  The five devices are an SBSRAM,
> > > two SDRAMs (16 bits each for 32 bit memory) and an XC2S200E.
> >
> >
> > Is this differential? In that case I would go for daisychaining and
> > termination at the end. SHORT stubs at intermediate devices.
>
> Isn't end termination the ONLY clean way when daisy-chaining? (According to
> the "bible" from Howard Johnson)
>
> Regards
> Falk

Article: 40811
Subject: Re: Xilinix FPGA with 5V IO
From: Austin Lesea <austin.lesea@xilinx.com>
Date: Fri, 15 Mar 2002 15:52:29 -0800
Links: << >> << T >> << A >>

Falk,

I agree.

Austin

Falk Brunner wrote:

> "Austin Lesea" <austin.lesea@xilinx.com> schrieb im Newsbeitrag
> news:3C92136E.B45304E1@xilinx.com...
> > Rick,
> >
> > Virtex II is also "5V compatible" when the current into the pin is less
> > than ~ 10 to 12 mA, so the 100 ohms works for Virtex II, too.
>
> Hm. After all, I think in most cases, you don't need such a high-tech "toy"
> like Virtex-II when there are still some "old guys" with 5V on your
> board (yes, PCI is one of these exceptions). A Spartan-II will do the job (I
> think . . .). Or some functions and the 5V Interface is put into a
> Spartan-II, and the rest into a Virtex-II (interfacing with the Spartan-II
> at 3.3V or even 2.5V, IO Banking rules ;-).
>
> Regards
> Falk

Article: 40812
Subject: Re: High speed clock routing
From: John_H <johnhandwork@mail.com>
Date: Sat, 16 Mar 2002 00:30:42 GMT
Links: << >> << T >> << A >>

When I quoted the zero delay buffers, I was trying to point you back into the
FPGA.  The DLLs and DCMs can produce clocks that are very nicely phase related to
a single clock, duplicating the functionality of a zero delay buffer without an
external part.

rickman wrote:

> I did consider using a zero delay buffer.  But this board is very tight
> for space and I have a hard time justifying it with 1.5 inch traces.
> But if the simulation shows a problem, of course we will do what we have
> to.
>
> John_H wrote:
> >
> > Are independent clocks from the FPGA something you want to avoid?  "Zero
> > delay buffers" are part of the clock management's best application.  It's
> > often better from a debug standpoint to have access to individual
> > terminations if things go desperately wrong.

Article: 40813
Subject: [Virtex 2] DCM: "Factory_JF" option box in FPGA editor question
From: hooiwai@yahoo.com (J.Ho)
Date: 15 Mar 2002 17:15:50 -0800
Links: << >> << T >> << A >>

Hi all,

If you push into the DCM block inside Xilinx FPGA Editor, you will
find a bunch of option boxes.  What does Factory_JF do?

In the Xilinx answer database it is described as a jitter filter
function, but users are discouraged from using it.

So...what do they do and what do those values mean?

It would be nice if I could set them to a user-specific center frequency
for whatever clock input I have, not just a "high" or "low" option.

Jon Ho

Article: 40814
Subject: Re: PCI design in a Spartan II which crashes in some wintel PCs
From: yuryws@optonline.com (Yury)
Date: 15 Mar 2002 18:07:28 -0800
Links: << >> << T >> << A >>

What is the last set of events that takes place on the bus when a crash occurs?
Do you have access to a logic analyzer?
I extensively used TLA714 with a Newwave adapter + LA software for PCI decoding.
It may save you a lot of time.


-- YWS

Article: 40815
Subject: Re: PCI design in a Spartan II which crashes in some wintel PCs
From: Kevin Brace <ihatespam99kevinbraceusenet@ihatespam99hotmail.com>
Date: Fri, 15 Mar 2002 21:11:20 -0600
Links: << >> << T >> << A >>

Oh, one more thing I thought of.
Does your state machine have a bus busy state?
If it doesn't (in other words, if it simply remains in an idle state when
the transaction is not for itself), your PCI interface might mistakenly
start a transaction, leading to a crash.
In PCI, FRAME# = 'L' and IRDY# = 'H' signals the start of a transaction,
and this can continue for multiple cycles during the first data phase,
but the same condition can also occur when an initiator inserts wait
states in the middle of a burst cycle (that is, after the first
microaccess).
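
To make the bus busy idea concrete, here is a rough behavioral sketch in
Python (not HDL, and not anyone's actual PCI core).  Signals are modelled as
booleans where True means the pin is high (deasserted).  The state names,
the address_hit flag and the example trace are all invented for
illustration, and fast back-to-back transactions are deliberately ignored.

```python
from enum import Enum

class BusState(Enum):
    IDLE = "idle"        # FRAME# and IRDY# both high
    BUSY = "busy"        # a transaction is running, but not for us
    CLAIMED = "claimed"  # a transaction is running and we are the target

def next_state(state, frame_n, irdy_n, address_hit):
    """Advance the tracker by one PCI clock."""
    if state is BusState.IDLE:
        if not frame_n:
            # First clock with FRAME# low after idle: this is the address phase.
            return BusState.CLAIMED if address_hit else BusState.BUSY
        return BusState.IDLE
    # BUSY or CLAIMED: wait for the bus to return to idle.
    if frame_n and irdy_n:
        return BusState.IDLE
    return state

def naive_claim(frame_n, irdy_n, address_hit):
    """A decoder with no busy state: reacts to FRAME# low / IRDY# high anywhere."""
    return (not frame_n) and irdy_n and address_hit

def run(trace):
    """Return the tracker state after each clock of (frame_n, irdy_n, address_hit)."""
    state, states = BusState.IDLE, []
    for frame_n, irdy_n, address_hit in trace:
        state = next_state(state, frame_n, irdy_n, address_hit)
        states.append(state)
    return states

# Someone else's burst.  On the fourth clock the initiator inserts a wait
# state and the data on AD happens to look like an address in our range.
trace = [
    (True,  True,  False),  # idle
    (False, True,  False),  # address phase, not our address
    (False, False, False),  # data phase
    (False, True,  True),   # initiator wait state, AD matches our decode
    (True,  False, False),  # last data phase
    (True,  True,  False),  # idle again
]

if __name__ == "__main__":
    print([s.value for s in run(trace)])
    print("naive decoder would claim:", any(naive_claim(*clk) for clk in trace))
```

The tracker never leaves BUSY during the foreign burst, while the decoder
without a busy state sees a false start on the wait-state clock.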




Kevin Brace

Article: 40816
Subject: Spartan II IOB tristate control FF use
From: "Dan" <daniel.deconinck@sympatico.ca>
Date: Fri, 15 Mar 2002 22:20:32 -0500
Links: << >> << T >> << A >>

Hello,

The Spartan II IOBs have three FFs: one for data in, one for data out and
one for the tristate control signal.

My design entry is schematic based (Viewdraw).
To use the data flip flops in the IOBs I use IFD instead of just FD and the
tools know to place the FF in the IOB. But there is no symbol for the
tristate control FF. I tried using the constraint IOB=TRUE on a FF but this
failed to use the IOB FF.

Help Please.

Sincerely
Daniel DeConinck

Article: 40817
Subject: To Falk Brunner
From: "Dan" <daniel.deconinck@sympatico.ca>
Date: Fri, 15 Mar 2002 22:26:30 -0500
Links: << >> << T >> << A >>

Thanks for the input.

Which signals can not be registered ?

Do you mean TRDY and IRDY ?

Xilinx told me that the dedicated IRDY & TRDY pins do have different logic
than the other IO pins. But they said that they do not publish the
difference and that only Xilinx can use it.

Sincerely
Daniel DeConinck

Article: 40818
Subject: To Yury's post
From: "Dan" <daniel.deconinck@sympatico.ca>
Date: Fri, 15 Mar 2002 22:27:23 -0500
Links: << >> << T >> << A >>

I do not have a logic analyzer. I plan to implement one within the Xilinx
chip.

Sincerely
Daniel DeConinck

Article: 40819
Subject: Re: Spartan II IOB tristate control FF use
From: Eric Crabill <eric.crabill@xilinx.com>
Date: Fri, 15 Mar 2002 20:10:07 -0800
Links: << >> << T >> << A >>

My personal preference is to use "FD" everywhere, and skip
the special symbols for IOB use.  Then, slap an IOB=TRUE
attribute on the instances you want packed into the IOB.

I prefer to do this attribute slapping in the UCF file,
not in the schematic.

Please also note that there are IOB flip flop packing
rules you must observe.  Simply because you put the
attribute on the instance does not mean it will happen.

For instance, in a given IOB, you need to make sure that
all of the flops are running off the same clock...  There
are other rules, too.

Eric

Dan wrote:
> 
> Hello,
> 
> The spartan II IOBs have three FFs. One for data in, one for data out and
> one for the tristate control signal.
> 
> My design entry is schematic based ( Viewdraw)
> To use the data flip flops in the IOBs I use IFD instead of just FD and the
> tools know to place the FF in the IOB. But there is no symbol for the
> tristate control FF. I tried using the constraint IOB=TRUE on a FF but this
> failed to use the IOB FF.
> 
> Help Please.
> 
> Sincerely
> Daniel DeConinck

Article: 40820
Subject: Reply to Kevin
From: "Dan" <daniel.deconinck@sympatico.ca>
Date: Fri, 15 Mar 2002 23:15:51 -0500
Links: << >> << T >> << A >>

Hi Kevin,

First off, thanks for lots of great input.

>Right, not all inputs can be registered
So which ones can not be registered ?

>I finally figured out a way to activate that mysterious PCILOGIC thing
>I can E-mail the sample code to anyone interested.
I accept your offer support@pixelsmart.com

Tsu < 7ns. I assumed that this applied to PCI input signals. I use a
dedicated clock input and the high speed clock routing. All input FFs are in
the IOB and get latched at the same time. I thought that was all that could
be done and all that should be done to input PCI signals. I get the feeling
that I am missing the point on Tsu. What am I missing ?
Tsu: where is this measured from and to ?

>I heard some chipsets don't let PCI devices that don't implement
>configuration registers function at all by cutting off clock supply to it.

I heard the same thing about the PCI clock being disabled.

>What are the chipsets do the motherboards you got have?
I have not even looked at my mother board chipset.

>You should never assume that burst read cycle won't occur.
I have always heard people complaining that wintel platforms do not burst
read. Is this just a myth? Do some of them actually burst read?

I use viewdraw. I will eventually learn VHDL. I get more embarrassed each
day about using schematics.

I would love to get my PCI logic design working rather than using another.
Having your own is very very flexible.

I do not have a bus busy state. But I only start a transaction if the
command is a memory read/write AND my address space is decoded. Shouldn't
this be enough to prevent me from starting a transaction that is not for me?

>In PCI, FRAME# = 'L' and IRDY# = 'H' signals the start of a transaction,
I think of the start of a transaction in another way. It is the first clock
that FRAME# goes low. This is when I latch the command nibble and the
address. I ignore IRDY# when making this determination. Shouldn't this be
sufficient to prevent me from claiming a transaction in error ?

Thanks again Kevin.

Dan DeConinck

Article: 40821
Subject: Re: High speed clock routing
From: rickman <spamgoeshere4@yahoo.com>
Date: Sat, 16 Mar 2002 00:42:42 -0500
Links: << >> << T >> << A >>

I have other uses for the clock buffers, like bringing clocks into the
chip.  In fact, I would like to have six low skew clock inputs, but
there are only four on the XC2Se I will be using.  


John_H wrote:
> 
> When I quoted the zero delay buffers, I was trying to point you back into the
> FPGA.  The DLLs and DCMs can produce clocks that are very nicely phase related to
> a single clock, duplicating the functionality of a zero delay buffer without an
> external part.
> 
> rickman wrote:
> 
> > I did consider using a zero delay buffer.  But this board is very tight
> > for space and I have a hard time justifying it with 1.5 inch traces.
> > But if the simulation shows a problem, of course we will do what we have
> > to.
> >
> > John_H wrote:
> > >
> > > Are independent clocks from the FPGA something you want to avoid?  "Zero
> > > delay buffers" are part of the clock management's best application.  It's
> > > often better from a debug standpoint to have access to individual
> > > terminations if things go desperately wrong.

-- 

Rick "rickman" Collins

rick.collins@XYarius.com
Ignore the reply address. To email me use the above address with the XY
removed.

Arius - A Signal Processing Solutions Company
Specializing in DSP and FPGA design      URL http://www.arius.com
4 King Ave                               301-682-7772 Voice
Frederick, MD 21701-3110                 301-682-7666 FAX

Article: 40822
Subject: Re: Spartan II IOB tristate control FF use
From: "Dan" <daniel.deconinck@sympatico.ca>
Date: Sat, 16 Mar 2002 01:01:04 -0500
Links: << >> << T >> << A >>

Hi Eric,

Good point about all the FFs in an IOB needing the same clk.

Well, my IOB=TRUE constraint did not get implemented. Do you know what else
might have caused this problem ?

Sincerely
Daniel DeConinck
www.PixelSmart.com
TEL: 416-248-4473

Article: 40823
Subject: Re: Reply to Kevin
From: hmurray-nospam@megapathdsl.net (Hal Murray)
Date: Sat, 16 Mar 2002 06:04:21 -0000
Links: << >> << T >> << A >>

>Tsu < 7ns. I assumed that this applied to PCI input signals. I use a
>dedicated clock input and the high speed clock routing. All input FFs are in
>the IOB and get latched at the same time. I thought that was all that could
>be done and all that should be done to input PCI signals. I get the feeling
>that I am missing the point on Tsu. What am I missing ?
>Tsu: where is this measured from and to ?

I'm far from a PCI wizard.

If you are asking that question, you need to read the PCI
specs a few more times.  You are probably missing something
important.

Tsu is the setup time on signals from the other end before your PCI
clock.  A few of the control signals need to go through some logic
before some signals get latched.

The normal hard example is IRDY/TRDY on a burst read from your
chip - read by the other chip or write by your chip.  You have to
have your data in the IOB FFs in order to meet timing.  You have
to clock in the new data if you want to go at full speed.  You
have to not clock in the new data (which would trash the current
data) if the other end doesn't have its IRDY/TRDY ready.  It
may not get to your chip (Tsu) until 7ns ahead of the clock.
You can't wait until the next cycle to make the decision.

Notice that TRDY and IRDY go into the magic PCI block and that
the output goes where the IOBs can use it for a clock enable.
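
To put rough numbers on the Tsu point: as far as I recall the 33 MHz budget
in the PCI specification, the 30 ns cycle is split into Tval (11 ns), Tprop
(10 ns), Tsu (7 ns) and Tskew (2 ns); check the spec before relying on
these.  The sketch below only shows how quickly 7 ns disappears when there
is logic between the pad and a flip-flop.  The FPGA delays in the example
calls are placeholders, not data sheet values.

```python
# 33 MHz PCI timing budget in ns, as recalled from the PCI specification.
PCI33_BUDGET_NS = {"Tval": 11.0, "Tprop": 10.0, "Tsu": 7.0, "Tskew": 2.0}

def setup_slack_ns(tsu_ns, input_path_ns, logic_ns, ff_setup_ns):
    """Slack left when a PCI input passes through logic before a flip-flop.

    tsu_ns        : time the signal is valid at the pin before the clock edge
    input_path_ns : pad and input buffer delay (placeholder value)
    logic_ns      : combinational logic and routing delay (placeholder value)
    ff_setup_ns   : setup time of the flip-flop D or CE input (placeholder value)
    """
    return tsu_ns - (input_path_ns + logic_ns + ff_setup_ns)

if __name__ == "__main__":
    print("cycle budget:", sum(PCI33_BUDGET_NS.values()), "ns")
    tsu = PCI33_BUDGET_NS["Tsu"]
    # One level of logic on IRDY#/TRDY# ahead of a clock enable (made-up delays).
    print("one level :", setup_slack_ns(tsu, 2.0, 3.0, 1.0), "ns slack")
    # Two levels plus longer routing (made-up delays): negative slack.
    print("two levels:", setup_slack_ns(tsu, 2.0, 6.5, 1.0), "ns slack")
```

A negative result means the decision cannot be made in general fabric in
time, which is the reason a dedicated path for IRDY/TRDY matters.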


You might be able to avoid logic ahead of the FFs (clock everything
in IOBs) if you are only doing single cycle transfers.  I doubt it.
(But again, I'm not a wizard.)


> >What are the chipsets do the motherboards you got have?
> I have not even looked at my mother board chipset.

I remember some comment from a long time ago about a popular
chipset being buggy.  It was easy to program around if you
knew what to do.  All "real" PCI implementations needed to
"support" that kludge.

-- 
These are my opinions, not necessarily my employer's.  I hate spam.

Article: 40824
Subject: Re: High speed clock routing
From: hmurray-nospam@megapathdsl.net (Hal Murray)
Date: Sat, 16 Mar 2002 06:20:46 -0000
Links: << >> << T >> << A >>


>I guess I am a little green with clocks above 50 MHz.  I was expecting
>runs this short to be pretty simple and not to have to do too much to
>make it work.  But with the input from Austin and yourself as well as
>some others, I do at least plan to take a first pass at a simulation. 
>My main concern is that you have to know a lot of details about the
>board to run a USEFUL simulation.  I have learned a lot from reading
>some of Bob Pease's articles and I fully realize that a simulation won't
>do me a lick of good if I don't make all the right assumptions. 

Clock frequency isn't the critical parameter.  You need to worry about
edge rate.  You would have the same troubles if you tried to run your
collection of chips at half speed.

If the round trip time is less than 1/Nth of your transition time,
then you can treat everything as a lumped capacitor.

Also remember that transmission lines go much slower if you add
lumped capacitors along them.
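
Both rules of thumb can be tried with a few lines of Python.  This is a
sketch under assumptions: the loaded-line formula is the usual textbook
approximation for load capacitance spread evenly along the line, and the
50 ohm impedance, the 1 ns transition time and N = 6 in the example are
illustrative guesses, not values from this thread.  The 510 ps delay
corresponds to about 3 inches at 170 ps/inch, and the 20 pF load is the sum
of the maximum input capacitances rickman quoted (three memories at 4 pF
plus 8 pF for the FPGA).

```python
import math

def is_lumped(round_trip_ps, transition_ps, n):
    """Rule of thumb: treat the net as lumped if round trip < transition / n."""
    return round_trip_ps < transition_ps / n

def line_capacitance_pf(z0_ohms, one_way_delay_ps):
    """Total capacitance of the unloaded line: C = delay / Z0 (ps / ohm = pF)."""
    return one_way_delay_ps / z0_ohms

def loaded_delay_ps(z0_ohms, one_way_delay_ps, total_load_pf):
    """One-way delay once load capacitance is spread evenly along the line."""
    c_line = line_capacitance_pf(z0_ohms, one_way_delay_ps)
    return one_way_delay_ps * math.sqrt(1.0 + total_load_pf / c_line)

def loaded_impedance_ohms(z0_ohms, one_way_delay_ps, total_load_pf):
    """Effective impedance of the same evenly loaded line."""
    c_line = line_capacitance_pf(z0_ohms, one_way_delay_ps)
    return z0_ohms / math.sqrt(1.0 + total_load_pf / c_line)

if __name__ == "__main__":
    z0, delay, load = 50.0, 510.0, 20.0   # ohms (assumed), ps, pF
    slow = loaded_delay_ps(z0, delay, load)
    print(f"unloaded delay: {delay:.0f} ps, loaded delay: {slow:.0f} ps")
    print(f"loaded impedance: {loaded_impedance_ohms(z0, delay, load):.1f} ohms")
    # Assumed 1 ns transition time and N = 6.
    print("lumped?", is_lumped(2 * slow, 1000.0, 6))
```

Note that clock frequency does not appear anywhere in the calculation, only
the transition time.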


For things like this, I highly recommend:

  High-Speed Digital Design - A Handbook of Black Magic
  by Johnson and Graham

The examples are a bit out of date now, but the methods of thinking
about a problem are correct.  It's a very educational book - fun to
read and (generally) easy to understand.  It's the sort of book
that I can pick up to check something and get sucked into reading
another chapter or two just because I see an interesting graph and
stop to check it out.  (like stopping to chat when you meet an old
friend)


A more modern version is:

  High-Speed Digital System Design
       A Handbook of Interconnect Theory and Design Practices
  by Hall, Hall, and McCall

I'm not as familiar with this.  It looks good, but doesn't seem
to be as much fun to read.  Lots of good/new stuff.


-- 
These are my opinions, not necessarily my employer's.  I hate spam.
