Messages from 30050

Article: 30050
Subject: Yet Another Newbie Question
From: "Speedy Zero Two" <dlatter@manorsNOSPAMway.freeserve.co.uk>
Date: Wed, 21 Mar 2001 21:04:13 -0000

Hi All,

I'll try to keep this as simple as possible.

I am designing an ATE board which has a number (TBD) of programmable logic
devices.
The use of these devices will vary from project to project and must be
in-system-reprogrammable.

The logic will vary, but the I/O pins remain the same. What is the best
device to ensure maximum routability?

Cheers
Dave

Article: 30051
Subject: Re: TOA measurement
From: Ray Andraka <ray@andraka.com>
Date: Wed, 21 Mar 2001 21:21:31 GMT

True, the correlation method would normally work best, but in this particular
application I don't think it will work too well.  The pulse is of limited
duration, so the length of the correlation is limited by your sample rate.  For a
well-designed pulse that is usually enough to produce a nice sharp peak.
For a rectangular or trapezoidal pulse that is long compared to the sample
interval, the signal has few distinguishing characteristics, so the correlation
is likely to be poor in the presence of noise and uncertain pulse width and
rise/fall times.
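
As a rough illustration of the point about long flat pulses (not part of the
original post), the Python sketch below correlates a noise-free trapezoidal
pulse against its own template and measures how far the correlation falls one
sample away from the peak. The function names, the pulse lengths and the
`relative_drop` measure are all assumptions chosen for the example.

```python
def trapezoid(rise, flat, fall):
    """Unit-amplitude trapezoidal pulse; all lengths are in samples."""
    up = [(i + 1) / (rise + 1) for i in range(rise)]
    down = [(fall - i) / (fall + 1) for i in range(fall)]
    return up + [1.0] * flat + down


def correlate(signal, template):
    """Sliding dot product of the template against the signal."""
    lags = len(signal) - len(template) + 1
    return [sum(s * t for s, t in zip(signal[lag:], template))
            for lag in range(lags)]


def peak_lag(corr):
    """Index of the largest correlation value."""
    return max(range(len(corr)), key=corr.__getitem__)


def relative_drop(corr):
    """Fractional fall-off one lag either side of the peak.

    A small value means a blunt peak that noise can easily move.
    """
    k = peak_lag(corr)
    neighbours = [corr[i] for i in (k - 1, k + 1) if 0 <= i < len(corr)]
    return (corr[k] - max(neighbours)) / corr[k]


def blunt_vs_sharp(pad=50):
    """Compare a long and a short pulse with the same 5-sample edges."""
    results = {}
    for name, flat in (("long", 200), ("short", 5)):
        pulse = trapezoid(5, flat, 5)
        signal = [0.0] * pad + pulse + [0.0] * pad
        corr = correlate(signal, pulse)
        results[name] = (peak_lag(corr), relative_drop(corr))
    return results
```

Both pulses peak at the true delay when there is no noise, but the long pulse
loses a far smaller fraction of its peak value per sample of misalignment, so
a modest amount of noise can shift its apparent peak.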

Juri Kanevski wrote:
> 
> The correlation method is probably the best solution.
> Even the signal can be sampled with poor amplitude resolution.
> After thousands of correlation averages you can
> derive the robust correlation peak.
> FPGA can implement the correlation with the sampling frequency ca 100
> MHz.
> When the correlation peak wave is then interpolated
> then its peak point can be estimated with the error up to ~1-2 ns.
> Really, when the radar impulse has more rich spectrum
> then the correlation peak will be stronger.
> 
> Ray Andraka wrote:
> >
> > You may be able to recover the pulse using a matched filter, but you are
> > probably going to have to digitize your data with more than one bit to get good
> > results with the noise level you indicate.  The output of the matched filter is
> > essentially a correlation of the signal against a model of the signal.  A
> > rectangular pulse is not a very good pulse shape for this though.
> >
> > Kolja Sulimma wrote:
> > >
> > > If I interpret that correctly the main problem is that the noise on the
> > > relatively slow rise time requires more sophisticated
> > > processing of the data than just computing the center of the pulse or doing
> > > something similar.
> > >
> > > I guess in that case I can not help you designing an algorithm, but when you
> > > find one, we can talk about the FPGA implementation.
> > > In general for DSP on small fixed point data FPGA is a good choice.
> > >
> > > CU,
> > >         Kolja
> > >
> > > Michal Kvasnicka wrote:
> > >
> > > > OK...
> > > > From my point of view, delay measurement needs two delayed signals, which
> > > > are compared in the TDE (time delay estimation) algorithm, but TOA
> > > > measurement works with only one precisely sampled signal and any available
> > > > additional a priori knowledge (pulse shape, noise model, etc.).
> > > >
> > > > The radar pulse can be approximated by a trapezoidal (symmetric or
> > > > asymmetric) pulse with the following parameters:
> > > > pulse width = 0.5 - 500us (50% amplitude level)
> > > > rise time = 20-100ns
> > > > fall time = 20-200ns
> > > > sample interval = 1 - 10ns
> > > > Pulse repetition interval = 1 - 5000us
> > > >
> > > > These pulses (the pulse train) are contaminated by noise of a general form
> > > > (colored non-Gaussian, spread spectrum, etc.) with a low S/N ratio (in many
> > > > cases). The result is a 1-channel data stream sampled every 1-10ns
> > > > with precise time stamping.
> > > >
> > > > Time sampling is realized by a rubidium standard (short term stability about
> > > > 10^-12 ) connected with a GPS time receiver for long term stability of about
> > > > 10^-13 - 10^-15.
> > > >
> > > > Required TOA accuracy is about 1-10ns.
> > > >
> > > > So, I need an effective and robust TOA algorithm which can be realized on a
> > > > DSP or FPGA chip sufficiently fast (see PRI value ~ 1-5000us).
> > > >
> > > > Regards,
> > > >
> > > > Michal
> > > >
> > > > "Kolja Sulimma" <kolja@bnl.gov> wrote in message
> > > > news:3AB65BBE.2055F00D@bnl.gov...
> > > > > Can you describe the requirements in more detail?
> > > > > Is it just delay measurement with high resolution?
> > > > > What resolution do you need? And how many channels?
> > > > >
> > > > > CU,
> > > > >         Kolja Sulimma
> > > > >
> > > > > Michal Kvasnicka wrote:
> > > > >
> > > > > > Does anyone know of any texts or references concerning implementing TOA
> > > > > > (time of arrival) measurement of the radar signal on FPGA or DSP chips?
> > > > > >
> > > > > > Thanks,
> > > > > >
> > > > > > Michal
> > > > >
> >
> > --
> > -Ray Andraka, P.E.
> > President, the Andraka Consulting Group, Inc.
> > 401/884-7930     Fax 401/884-7950
> > email ray@andraka.com
> > http://www.andraka.com

-- 
-Ray Andraka, P.E.
President, the Andraka Consulting Group, Inc.
401/884-7930     Fax 401/884-7950
email ray@andraka.com  
http://www.andraka.com

Article: 30052
Subject: Re: Yet Another Newbie Question
From: Peter Alfke <peter.alfke@xilinx.com>
Date: Wed, 21 Mar 2001 13:23:53 -0800

I suggest the largest Virtex-E device you can afford. It has plenty of routing,
and pin-locking is not an issue.
Virtex-II would be better, but it is a very young family, and the larger devices
are not yet available.
Altera devices, the only realistic alternative, are notorious for "pin-locking"
problems, i.e. difficulties when you have to change the logic but maintain the
pin-out. Which is what you want to do.

Peter Alfke, Xilinx Applications
=================================
Speedy Zero Two wrote:

> Hi All,
>
> I'll try to keep this as simple as possible.
>
> I am designing an ATE board which has a number (TBD) of programmable logic
> devices.
> The use of these devices will vary from project to project and must be
> in-system-reprogrammable.
>
> The logic will vary (but the IO pins remain the same) what is the best
> device to ensure the maximum routability.
>
> Cheers
> Dave

Article: 30053
Subject: Re: Senior I/O Designer - Canada
From: Andy Peters <"apeters <"@> noao [.] edu>
Date: Wed, 21 Mar 2001 16:31:36 -0700

Lee Iovino wrote:
> 
> "Aggressive Meat" sounds like the name of a punk band from New Jersey.

No, they were from across the river in Brooklyn.  You're thinking of the
Meatmen, who were actually from Chicago.

-andy
former NJ punk rocker.

Article: 30054
Subject: reduced precision floating point
From: "Pete Dudley" <padudle@sandia.gov>
Date: Wed, 21 Mar 2001 17:30:32 -0700

Has anyone tried implementing reduced precision floating point arithmetic in
an fpga?

With the new Virtex II 18 bit multipliers it occurred to us that we could do
floating point multiplies pretty efficiently if we reduced the mantissa size
from 23 bits to say 17 so that the mantissa multiply would fit in the hard
multipliers.

We want to miniaturize an algorithm that currently executes on a fabric of
Mercury PowerPC nodes, in floating point of course.

--
Pete Dudley
Sandia Labs
Albuquerque, NM
padudle@sandia.gov

Article: 30055
Subject: Re: reduced precision floating point
From: Ray Andraka <ray@andraka.com>
Date: Thu, 22 Mar 2001 01:09:41 GMT

The floating point multiplies aren't that expensive compared to a fixed point
multiply.  It's the floating point adders that kill you.

A floating point multiplier needs the fixed point multiplier plus an adder to
sum the incoming exponents, an incrementer, a 2:1 word-wide mux and leading zero
detect to renormalize the product (a one-bit shift is all that is ever needed),
and a second incrementer to round the result if desired.
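
The pieces listed above can be sketched in Python as follows. This is only an
illustration, not anyone's actual design: the 17-bit significand width
(`MANT`, counting the leading 1), the tuple layout and the helper names are
assumptions, and zero, infinities, NaNs and exponent overflow are ignored.

```python
import math

MANT = 17  # significand bits including the leading 1 (illustrative choice)


def from_float(x):
    """Pack a nonzero finite float as (sign, exponent, significand)."""
    sign = 1 if x < 0 else 0
    frac, exp = math.frexp(abs(x))      # abs(x) == frac * 2**exp, 0.5 <= frac < 1
    mant = round(frac * (1 << MANT))
    if mant >> MANT:                    # rounding carried out of the top bit
        mant >>= 1
        exp += 1
    return sign, exp - 1, mant


def to_float(value):
    """Unpack (sign, exponent, significand) back to a Python float."""
    sign, exp, mant = value
    return (-1.0) ** sign * mant * 2.0 ** (exp - (MANT - 1))


def fp_mul(a, b):
    """Multiply two packed values using only integer operations."""
    sa, ea, ma = a
    sb, eb, mb = b
    sign = sa ^ sb
    exp = ea + eb                       # adder summing the incoming exponents
    prod = ma * mb                      # the fixed point multiplier
    if prod >> (2 * MANT - 1):          # top product bit set: 2:1 mux picks the
        shift = MANT                    # one-bit renormalizing shift and the
        exp += 1                        # first incrementer bumps the exponent
    else:
        shift = MANT - 1
    mant = (prod + (1 << (shift - 1))) >> shift   # second incrementer: round
    if mant >> MANT:                    # rounding overflowed to 2**MANT
        mant >>= 1
        exp += 1
    return sign, exp, mant
```

For example, `to_float(fp_mul(from_float(1.5), from_float(2.5)))` reproduces
the product exactly because both inputs fit in the reduced significand; values
such as pi are rounded on the way in and again after the multiply.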

A floating point add needs a subtracter to determine which addend has the bigger
exponent, a barrel shift on the input to denormalize the smaller addend, and an
exchange network to steer the smaller addend to the leg with the barrel shift
(the exchange network is cheaper than a second barrel shift), all in front of
your fixed point adder.  Then after the adder you need a second barrel shifter to
renormalize the sum, an additional adder to adjust the exponent for the
renormalization, an encoding of the number of leading zeros (you can get that
from the barrel shift) to generate the renormalize offset, plus an incrementer
to round if desired.  The barrel shift's complexity approaches that of the fixed
point multiply.

Pete Dudley wrote:
> 
> Has anyone tried implementing reduced precision floating point arithmetic in
> an fpga?
> 
> With the new Virtex II 18 bit multipliers it occurred to us that we could do
> floating point multiplies pretty efficiently if we reduced the mantissa size
> from 23 bits to say 17 so that the mantissa multiply would fit in the hard
> multipliers.
> 
> We want to miniaturize an algorithm that currently executes on a fabric of
> Mercury PowerPC nodes, in floating point of course.
> 
> --
> Pete Dudley
> Sandia Labs
> Albuquerque, NM
> padudle@sandia.gov

-- 
-Ray Andraka, P.E.
President, the Andraka Consulting Group, Inc.
401/884-7930     Fax 401/884-7950
email ray@andraka.com  
http://www.andraka.com

Article: 30056
Subject: Re: reduced precision floating point
From: Peter Alfke <palfke@earthlink.net>
Date: Thu, 22 Mar 2001 04:07:02 GMT

Ray Andraka wrote:

> <snip, lots of good stuff>  The
> barrel shifts complexity approaches that of the fixed point multiply.

But remember, the 18 x 18 multiplier is an excellent arithmetic shifter (the way I
understand FP, you want an arithmetic shift, not a barrel shift).
All the multiplexer and routing complexity gets swallowed up in one fast
multiplier.
It is my guess that more Virtex-II multipliers will be used as shifters than as
multipliers...
And every surplus multiplier (same as every surplus BlockRAM) is free. What a
bargain!
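
The shifter trick amounts to multiplying by a one-hot power of two. A minimal
Python sketch of the idea, assuming unsigned 17-bit operands (the `WIDTH`
value and function names are illustrative, and sign handling in a real signed
multiplier is ignored here):

```python
WIDTH = 17                      # assumed unsigned operand width
MASK = (1 << WIDTH) - 1


def shift_left_via_mul(x, k):
    """Left shift by k: multiply by 2**k and keep the low WIDTH product bits."""
    return (x * (1 << k)) & MASK


def shift_right_via_mul(x, k):
    """Logical right shift by k (0 <= k <= WIDTH).

    Multiply by 2**(WIDTH - k) and take the upper half of the product.
    """
    return (x * (1 << (WIDTH - k))) >> WIDTH
```

The second operand is just a decoded shift count, so the only logic outside
the multiplier is the binary-to-one-hot decoder.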

Peter Alfke, Xilinx Applications


Article: 30057
Subject: Re: TOA measurement
From: Charles Lyttle <lyttlec@earthlink.net>
Date: Thu, 22 Mar 2001 04:40:45 GMT

Michal Kvasnicka wrote:
>
> Did you read my previous posts?
>
> +>Time sampling is realized by Rubidium normal (short term stability about
> +>10^-12 ) connected with GPS time receiver for long term stability about
> +>10^-13 - 10^-15.
>
> TOA is measured as absolute time distributed in the network of the receivers
> with accuracy about 1ns. This time is not necessarily synchronized with UTC,
> because I need only time difference (TDOA multilateration method for 3-D
> target location) TOA_1st station - TOA_2nd station = TDOA_1st2nd, etc.
>
> What now? Any suggestion from your side?
>
> Regards, Michal
>
> "Jerry Avins" <jya@ieee.org> wrote in message
> news:3AB8F776.B4D79B0A@ieee.org...
> > Please satisfy my curiosity about what goes on here. If you know nothing
> > about the transmitter, what instant do you measure the delay from?
> >
> > Jerry
> > --
> > Engineering is the art of making what you want from things you can get.
> > -----------------------------------------------------------------------
> > Michal Kvasnicka wrote:
> > >
> > > Dear Juri,
> > >
> > > everything you wrote is OK, but my problem is "passive"
> > > location (TDOA multilateration without sending the whole pulse train to the
> > > central station), not standard radar measurement. So, the transmitted signal
> > > is, in general, unknown to me. I know only some a priori information
> > > (regarding the transmitted signal) which mostly plays the role of some
> > > boundaries on the received signal shape.
> > >
> > > I am looking for an algorithm which is able to define some unique reference
> > > point for each pulse, and this reference point will be used for precise TOA
> > > measurement.
> > >
> > > If you could help me with this problem I will be really very happy.
> > >
> > > Best regards, Michal
> > >
> >   ...
It looks like you will need to take a probabilistic approach. Try
looking for literature on Kalman filters in a feedback configuration.
You are trying to estimate the time at which the received true signal
exceeds a specified level. Traditionally, the threshold is taken as the
90% level. NASA has worked this problem many times, so there should be
some technical literature available from them. Most university libraries
can get or have NASA "TSP"s
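
To make the threshold idea concrete, here is a small Python sketch that picks
the reference point as the first upward crossing of a fraction of the peak,
with linear interpolation between samples. It is not a Kalman filter and takes
no account of noise; the function name, the use of the largest sample as the
peak estimate and the default 90% level are assumptions for illustration only.

```python
def threshold_crossing_time(samples, sample_interval, level_fraction=0.9):
    """Time of the first upward crossing of level_fraction * peak.

    Returns None if the samples never cross the threshold upward.
    The time is measured from the first sample.
    """
    threshold = level_fraction * max(samples)
    for i in range(1, len(samples)):
        low, high = samples[i - 1], samples[i]
        if low < threshold <= high:
            # Linear interpolation between the two bracketing samples.
            frac = (threshold - low) / (high - low)
            return (i - 1 + frac) * sample_interval
    return None
```

With a 10 ns sample interval and a clean ramp, the interpolation places the
crossing between sample instants, which is where the sub-sample resolution
comes from. On noisy data the peak estimate and the crossing would both need
filtering first.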

-- 
Russ Lyttle
"World Domination through Penguin Power"
The Universal Automotive Testset Project at
<http://home.earthlink.net/~lyttlec>

Article: 30058
Subject: Is the carry logic for Virtex included in PAR timing report/check?
From: "Austin Franklin" <austin@dark99room.com>
Date: Thu, 22 Mar 2001 00:13:27 -0500

I am curious if the carry logic is included in the PAR timing report...does
anyone know this for sure?  Version 3.2.05i.

Article: 30059
Subject: Re: DLL jitter "bake-off" vs. PLL
From: Rick Collins <spamgoeshere4@yahoo.com>
Date: Thu, 22 Mar 2001 02:06:40 -0500

Interesting that you mention the IO. At a company where I have done
work, they needed a 622 MHz interface. The Xilinx solution using Virtex
required multiple inputs for the clock. I assume that the global clock
routing was not up to the task at that speed. With 16 data bits and 8
clock inputs, the resync problem was so large that they ended up using
the Altera parts. They used 16 data inputs and a single ~80 MHz clock
input. A PLL ran the clock back up to 622 MHz and the input FFs all ran
off the same clock. 

So neither system really did what the designer wanted. But the Altera
solution was the simplest at the time. 
 

Ray Andraka wrote:
> 
> Rick,
> 
> You've apparently discovered one of (what I consider to be) the major advantages
> to the xilinx architecture.  The SRL16/CLB ram capability of the Xilinx
> architectures has enormous potential for reducing the size of a design compared
> to any other FPGA out there.  I've also mentioned the limitations of the carry
> chain implementation in Altera families here before, as compared to the Xilinx
> carry structure.  From a silicon standpoint, the Xilinx architecture is superior
> for data path and signal processing designs.  As for the 622MHz I/O, be cautious
> of the claims of any vendor. There are plenty of caveats.  Also, even if the IO
> pins support it, make sure you understand the clocking structure as well, as the
> clock is often the limiting factor rather than the IOBs.
> 
> Rick Collins wrote:
> >
> > Falk Brunner wrote:
> > >
> > > Austin Lesea wrote:
> > > >
> > > > For those interested:
> > > >
> > > >  http://www.xilinx.com/products/virtex/techtopic/vtt013.pdf
> > >
> > > ;-))) This is getting funny.
> > >
> > > > Comments are appreciated,
> > >
> > > Hmm, what should we expect?? That Altera says the Xilinx parts are
> > > better??
> > > And Xilinx says the Altera parts are better??
> > > Both "experiments" have their points, but they both have the smell of
> > > marketing and are influenced by company policy.
> > > It's like the Pepsi and Coca-Cola fight . . .
> > > After all, both devices must prove their qualities in real-world
> > > applications; it's always possible to bring a good device down on its
> > > knees with a heavy test (and vice versa ;-))
> > >
> > > --
> > > Best regards,
> > > Falk
> >
> > Yes, and this also ignores the many other issues involved in picking an
> > FPGA vendor. I am working with a company that does not commit to a
> > single vendor. They do their FPGA designs in HDL and do not use heavily
> > the proprietary features unless necessary. They then pick the chip for
> > the board at the final stage before building the prototype. This
> > maximizes their leverage and gets them the best price for their boards.
> >
> > Of course there are times that they have to pick one or the other based
> > on technical features. A new design with 10 Gbps fiber interface was
> > just not doable in a Xilinx part because of the high speed (622 MHz)
> > data path. The Altera part does this with a single clock. The Xilinx
> > solution was to use a clock for every two data pins. They would have
> > then needed fifos to resync the data to a common clock. The designers
> > felt this was not workable.
> >
> > I personally am more impressed with the Xilinx parts. I recently found
> > that the low cost ACEX parts from Altera (based on the 10K arch) do
> > not let you use the LUTs as RAM. I see this as a major drawback when you
> > need many small fifos.
> >
> > But again a non-technical issue of supply may force me to use the ACEX
> > instead of the Spartan II parts.

-- 

Rick "rickman" Collins

rick.collins@XYarius.com
Ignore the reply address. To email me use the above address with the XY
removed.

Arius - A Signal Processing Solutions Company
Specializing in DSP and FPGA design      URL http://www.arius.com
4 King Ave                               301-682-7772 Voice
Frederick, MD 21701-3110                 301-682-7666 FAX

Article: 30060
Subject: Re: TBUFs in Virtex and later chips, going out of fashion, what instead
From: hmurray-nospam@megapathdsl.net (Hal Murray)
Date: Thu, 22 Mar 2001 07:33:00 -0000

>As the chips scale, driving a 1" long wire became slower and slower (relative to
>the whole picture).  Recently we removed TBUF's and replaced them with mux's in
>a few thousands of designs, and without replacing and rerouting, the designs
>were all faster.  They also took up more area (the mux's).  By removing TBUF's
>we gain that area back (TBUF's have to be huge to drive the long wires), so that
>now are still more area efficient, and higher speed than before.  With PAR, the
>designs are all "better" than before.

I'm missing something.  Why are muxes better/faster?

I see why driving a 1" long wire is tough, but I don't see why
the driver after a mux is different from a TBUF.

In one case you have to turn the driver on.  In the other case
you have to get half way across the chip, then through a mux.
I'm assuming that getting the go signal to the tbuf is about
as hard as getting the select signals to a mux.


----

In either case, a critical parameter is how well the TBUF/mux
density lines up with a counter.  Assume I have a counter
that uses the "obvious" fast carry logic.  I need to get
that on the bus and to load other counters and registers
from that bus.

How many bits per CLB does that good counter use?

If I'm using TBUFs, I need that many TBUFs per CLB.

If I'm using muxes, I need enough routing to get all the
registers that drive the bus into the mux.  I haven't tried
to build anything like that.  It might be simpler if the mux
is distributed or some trick like that.
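
One way to picture a distributed mux is as an AND-OR bus: each source gates its
register with its own enable, and the gated values are ORed together along the
way. A behavioural Python sketch, with the function name and the 16-bit width
chosen purely for illustration:

```python
def and_or_bus(values, enables, width=16):
    """OR together every source whose enable is set.

    With one-hot enables this behaves like the selected tri-state driver.
    Unlike a TBUF bus, no enable gives 0 rather than a floating bus, and
    two enables give the OR of the sources rather than contention.
    """
    mask = (1 << width) - 1
    bus = 0
    for value, enable in zip(values, enables):
        if enable:
            bus |= value & mask
    return bus
```

For example, `and_or_bus([0x1234, 0xBEEF, 0x0F0F], [0, 1, 0])` selects the
second source. The routing question raised above remains: each bit of the OR
chain still has to pass every CLB that holds a source register.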

-- 
These are my opinions, not necessarily my employer's.  I hate spam.

Article: 30061
Subject: Re: Do I need to tie unused CPLD pins to GND?
From: kfalser@REMOVETHISdurst.it (Klaus Falser)
Date: Thu, 22 Mar 2001 07:41:58 GMT

On Wed, 21 Mar 2001 10:19:32 -0800, Peter Alfke
<peter.alfke@xilinx.com> wrote:

>Some corrections, see below:
>
>Klaus Falser wrote:
>
>> As for EVERY chip, EVERY unused pin should be tied to a defined
>> and valid level, usually ground. This is to avoid damaging
>> the pins (and probably the chip) if someone touches
>> them accidentally (ESD).
>
>That's not the reason commonly given, but it helps, of course. Note that all (our) inputs and outputs are protected with two diodes, one to ground, the other one to Vcc or as a Zener diode,
>protecting against positive spikes.
>
>>
>> Another reason, but I don't know if it applies to Xilinx chips,
>> is that an invalid level on an input (and remember, these pins are
>> inputs and outputs simultaneously) may create a large shunt
>> current, since the upper and the lower transistor in the
>> input structure are on simultaneously.
>
>If the pin floats into the middle between ground and Vcc, then there is a certain current, at most a few milliamps per pin. Should be avoided, but is not catastrophic.
>Xilinx pins have a default weak-pull-up resistor for that purpose (only).
>
>Greetings to South Tyrol!
>
>Peter Alfke, Xilinx Applications
>
>
>>
>>
>> I would suggest to connect all unused pins to ground and use
>> the above mentioned option too.
>> If you need to preserve some pins for future use, the
>> PROHIBIT option should work now.
>>
>> Hope this helps
>>         Klaus
>>

Thank you for your additional information.

In my environment, where every chip is mounted on a PCB
and a connection to GND is easily made, I can not see 
any reason to leave unused pins unconnected.

Greetings from the mountains (Yodelio...)

Falser Klaus
R&D Electronics Department
Company	: Durst Phototechnik AG
	Vittorio Veneto Str. 59
	I-39042 Brixen
Voice	: +0472/810235
	: +0472/810111
FAX	: +0472/830980
Email	: kfalser@IHATESPAMdurst.it

Article: 30062
Subject: Re: Spartan-II Evaluation Board
From: Alan Langman <alan@eng.uct.ac.za>
Date: Thu, 22 Mar 2001 11:09:16 +0200

Hi

If you are keen to make your own, we have a project at www.openh.org
based around the SpartanII-100K device. All the CAD files are in
the CVS repository. I should be adding a release with Gerber files
shortly.

Cheers

Alan

> Simon wrote:
>
> > llandre wrote:
> > >
> > > > I'm looking for an evaluation board for Xilinx Spartan-II FPGAs. Does
> > > > anybody know where to get an eval board that is not too expensive?
> ...
> > The best I've found is the BurchEd board (XCS200 for $120 US, works
> > with WebPack). Found no problems :-))
>
> a web address, please?
> --
> http://www.pascaland.org/ compilers, sources and links for Pascal and Delphi

Article: 30063
Subject: Re: TOA measurement
From: "Michal Kvasnicka" <m.kvasnicka@era.cz>
Date: Thu, 22 Mar 2001 10:36:48 +0100

Hi Russ

thanks for the interesting hint. Could you be so kind as to suggest some
relevant keywords for searching the NASA "TSP" resources?

Regards, Michal


"Charles Lyttle" <lyttlec@earthlink.net> wrote in message
news:3AC0AC1E.ED78312C@earthlink.net...
Michal Kvasnicka wrote:
>
> Did you read my previous posts?
>
> +>Time sampling is realized by Rubidium normal (short term stability about
> +>10^-12 ) connected with GPS time receiver for long term stability about
> +>10^-13 - 10^-15.
>
> TOA is measured as absolute time distributed in the network of the receivers
> with accuracy about 1ns. This time is not necessarily synchronized with UTC,
> because I need only time difference (TDOA multilateration method for 3-D
> target location) TOA_1st station - TOA_2nd station = TDOA_1st2nd, etc.
>
> What now? Any suggestion from your side?
>
> Regards, Michal
>
> "Jerry Avins" <jya@ieee.org> wrote in message
> news:3AB8F776.B4D79B0A@ieee.org...
> > Please satisfy my curiosity about what goes on here. If you know nothing
> > about the transmitter, what instant do you measure the delay from?
> >
> > Jerry
> > --
> > Engineering is the art of making what you want from things you can get.
> > -----------------------------------------------------------------------
> > Michal Kvasnicka wrote:
> > >
> > > Dear Juri,
> > >
> > > everything what did you write down is OK, but my problem is "passive"
> > > location (TDOA multilateration without sending whole pulse train to the
> > > central station) not standard radar measurement. So, the transmitted signal
> > > is, in general, unknown for me. I know only some apriori information
> > > (regarding transmitted signal) which mostly play role of some boundaries
> > > regarding received signal shape.
> > >
> > > I am looking for algorithm which is able to define some unique reference
> > > point for each pulse and this reference point will be used for precise TOA
> > > measurement.
> > >
> > > If you could help me with this problem I will be really very happy.
> > >
> > > Best regards, Michal
> > >
> >   ...
It looks like you will need to take a probabilistic approach. Try
looking for literature on Kalman filters in a feedback configuration.
You are trying to estimate the time at which the received true signal
exceeds a specified level. Traditionally, the threshold is taken as the
90% level. NASA has worked this problem many times, so there should be
some technical literature available from them. Most university libraries
can get or have NASA "TSP"s

--
Russ Lyttle
"World Domination through Penguin Power"
The Universal Automotive Testset Project at
<http://home.earthlink.net/~lyttlec>

Article: 30064
Subject: Re: reduced precision floating point
From: Ray Andraka <ray@andraka.com>
Date: Thu, 22 Mar 2001 13:11:42 GMT

Yup,

And you need two of them per adder, plus a leading zeros detect to set the shift
in the second one.  (If you use a merged tree barrel shift you get that leading
zero detect for very little extra logic, but with a multiplier you essentially
need a priority encoder.)  That still sounds pretty expensive compared to a
fixed point adder to me.  An XC2V10000 has 192 multiplier/block RAMs.  That's 96
floating point adds out of 10 million gates.  Using an XC2V40, that's only 2 FP
adders.

Peter Alfke wrote:
> 
> Ray Andraka wrote:
> 
> > <snip, lots of good stuff>  The
> > barrel shifts complexity approaches that of the fixed point multiply.
> 
> But remember, the 18 x 18 multiplier is an excellent arithmetic shifter ( the way I
> understand FP, you want arithmetic, not barrel shift)
> All the multiplexer and routing complexity gets swallowed up in one fast
> multiplier.
> It is my gueass that more Virtex-II multipliers will be used as shifters than as
> multipliers...
> And every surplus multiplier ( same as every surplus BlockRAM) is free. What a
> bargain!
> 
> Peter Alfke, Xilinx Applications
> 
> >
> >

-- 
-Ray Andraka, P.E.
President, the Andraka Consulting Group, Inc.
401/884-7930     Fax 401/884-7950
email ray@andraka.com  
http://www.andraka.com

Article: 30065
Subject: Re: TOA measurement
From: Ray Andraka <ray@andraka.com>
Date: Thu, 22 Mar 2001 13:13:41 GMT

I was pointing out that normally you'd use correlation, but in this case it will
not work very well because a) you don't have an accurate model of the pulse and
b) the pulse shape does not have very good correlation properties.

Michal Kvasnicka wrote:
> 
> You are talking about correlation, but in previous messages I said that the
> model pulse (asymmetric trapezoidal pulse with unknown pulse width and rise
> or fall time variations) is more or less unknown.
> 
> What kind of correlation do you have in mind? Please, could you be so kind as
> to describe your idea in more detail?
> 
> Thanks in advance,
> 
> Michal
> "Juri Kanevski" <kanevski@comsys.ntu-kpi.kiev.ua> wrote in message
> news:3AB72D77.E53BFB9E@comsys.ntu-kpi.kiev.ua...
> > The correlation method is probably the best solution.
> > Even the signal can be sampled with poor amplitude resolution.
> > After thousands of correlation averages you can
> > derive the robust correlation peak.
> > FPGA can implement the correlation with the sampling frequency ca 100
> > MHz.
> > When the correlation peak wave is then interpolated
> > then its peak point can be estimated with the error up to ~1-2 ns.
> > Really, when the radar impulse has more rich spectrum
> > then the correlation peak will be stronger.
> >
> > Ray Andraka wrote:
> > >
> > > You may be able to recover the pulse using a matched filter, but you are
> > > probably going to have to digitize your data with more than one bit to
> get good
> > > results with the noise level you indicate.  The output of the matched
> filter is
> > > essentially a correlation of the signal against a model of the signal.
> A
> > > rectangular pulse is not a very good pulse shape for this though.
> > >
> > > Kolja Sulimma wrote:
> > > >
> > > > If I interpret that correctly the main problem is that the noise on
> the
> > > > relativly slow rise time requires more sophisticated
> > > > processing of the data then just computing the center of the pulse or
> doing
> > > > something similar.
> > > >
> > > > I guess in that case I can not help you designing an algorithm, but
> when you
> > > > find one, we can talk about the FPGA implementation.
> > > > In general for DSP on small fixed point data FPGA is a good choice.
> > > >
> > > > CU,
> > > >         Kolja
> > > >
> > > > Michal Kvasnicka wrote:
> > > >
> > > > > OK...
> > > > > From my point of view delay measurement needs two delayed signal
> which are
> > > > > compared in the TDE (time delay estimation) algorithm, but TOA
> measurement
> > > > > work only with one precisely sampled signal and any available
> additional
> > > > > apriory knowledge (pulse shape, noise model, etc.).
> > > > >
> > > > > Radar pulse can be approximated by trapezoidal (symmetric or
> asymmetric)
> > > > > pulse wit the following parameters:
> > > > > pulse width = 0.5 - 500us (50% amplitude level)
> > > > > rise time = 20-100ns
> > > > > fall time = 20-200ns
> > > > > sample interval = 1 - 10ns
> > > > > Pulse repetition interval = 1 - 5000us
> > > > >
> > > > > These pulses (pulse train) is contaminated by noise of the general
> form
> > > > > (colored nongaussian, spread spectrum, etc.) with low S/N ration (in
> many
> > > > > cases). Finally is represented by 1-channel data stream sampled by
> 1-10ns
> > > > > and with precise time stamping.
> > > > >
> > > > > Time sampling is realized by Rubidium normal (short term stability
> about
> > > > > 10^-12 ) connected with GPS time receiver for long term stability
> about
> > > > > 10^-13 - 10^-15.
> > > > >
> > > > > Required TOA accuracy is about 1-10ns.
> > > > >
> > > > > So, I need effective and robust TOA algorithm which can be realized
> on DSP
> > > > > or FPGA chip sufficiently fast (see PRI value ~ 1-5000us).
> > > > >
> > > > > Regards,
> > > > >
> > > > > Michal
> > > > >
> > > > > "Kolja Sulimma" <kolja@bnl.gov> píse v diskusním príspevku
> > > > > news:3AB65BBE.2055F00D@bnl.gov...
> > > > > > Can you describe the rwquirements in more detail?
> > > > > > Is it just delay measurement with high resolution?
> > > > > > What resolution do you need? And how many channels?
> > > > > >
> > > > > > CU,
> > > > > >         Kolja Sulimma
> > > > > >
> > > > > > Michal Kvasnicka wrote:
> > > > > >
> > > > > > > Does anyone know of any texts or references concerning
> implementing TOA
> > > > > > > (time of arrival) measurement of the radar signal on FPGA or DSP
> chips?
> > > > > > >
> > > > > > > Thanks,
> > > > > > >
> > > > > > > Michal
> > > > > >
> > >
> > > --
> > > -Ray Andraka, P.E.
> > > President, the Andraka Consulting Group, Inc.
> > > 401/884-7930     Fax 401/884-7950
> > > email ray@andraka.com
> > > http://www.andraka.com

-- 
-Ray Andraka, P.E.
President, the Andraka Consulting Group, Inc.
401/884-7930     Fax 401/884-7950
email ray@andraka.com  
http://www.andraka.com

Article: 30066
Subject: Re: Is the carry logic for Virtex included in PAR timing report/check?
From: Ray Andraka <ray@andraka.com>
Date: Thu, 22 Mar 2001 13:16:59 GMT

Yes, it is.  What Jan was complaining about is that if you do an add-mux in one
level of logic the mux action gates off the carry chain making the carry chain
timing a false path, which artificially makes the timing for the mux-direct
input much tighter than it really is.

Austin Franklin wrote:
> 
> I am curious if the carry logic is included in the PAR timing report...does
> anyone know this for sure?  Version 3.2.05i.

-- 
-Ray Andraka, P.E.
President, the Andraka Consulting Group, Inc.
401/884-7930     Fax 401/884-7950
email ray@andraka.com  
http://www.andraka.com

Article: 30067
Subject: Re: TBUFs in Virtex and later chips, going out of fashion, what instead
From: Ray Andraka <ray@andraka.com>
Date: Thu, 22 Mar 2001 13:20:51 GMT
Links: << >> << T >> << A >>

If Xilinx extended the horizontal OR chains in V2 so that there were at least 4
per CLB, we'd be all set.
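
As a rough behavioral sketch of the idea (not a model of any Xilinx primitive), an
AND-OR bus gates each source with its enable and ORs the results together, which is
the structure an OR chain supports.  The function names and the sample data words
below are hypothetical, chosen only for illustration.

```python
from functools import reduce


def and_or_bus(sources):
    """AND-OR bus: OR together the data word of every enabled source.

    sources is a list of (enable, data) pairs.  A disabled source
    contributes 0, so no tri-state driver is needed.
    """
    return reduce(lambda acc, s: acc | (s[1] if s[0] else 0), sources, 0)


def tristate_bus(sources):
    """Tri-state bus: exactly one enabled driver may own the wire."""
    drivers = [data for enable, data in sources if enable]
    if len(drivers) > 1:
        raise ValueError("bus contention: more than one driver enabled")
    return drivers[0] if drivers else None  # None stands for a floating bus


# Three hypothetical 8-bit registers, only the second one enabled.
regs = [(False, 0xAA), (True, 0x3C), (False, 0xFF)]
bus_value = and_or_bus(regs)
```

With a one-hot enable both models return the same word; the AND-OR version simply
reads 0 instead of floating when nothing is enabled.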

Hal Murray wrote:
> 
> >As the chips scale, driving a 1" long wire became slower and slower (relative to
> >the whole picture).  Recently we removed TBUF's and replaced them with mux's in
> >a few thousands of designs, and without replacing and rerouting, the designs
> >were all faster.  They also took up more area (the mux's).  By removing TBUF's
> >we gain that area back (TBUF's have to be huge to drive the long wires), so that
> >now are still more area efficient, and higher speed than before.  With PAR, the
> >designs are all "better" than before.
> 
> I'm missing something.  Why are muxes better/faster?
> 
> I see why driving a 1" long wire is tough, but I don't see why
> the driver after a mux is different from a TBUF.
> 
> In one case you have to turn the driver on.  In the other case
> you have to get half way across the chip, then through a mux.
> I'm assuming that getting the go signal to the tbuf is about
> as hard as getting the select signals to a mux.
> 
> ----
> 
> In either case, a critical parameter is how well the TBUF/mux
> density lines up with a counter.  Assume I have a counter
> that uses the "obvious" fast carry logic.  I need to get
> that on the bus and to load other counters and registers
> from that bus.
> 
> How many bits per CLB does that good counter use?
> 
> If I'm using TBUFs, I need that many TBUFs per CLB.
> 
> If I'm using muxes, I need enough routing to get all the
> registers that drive the bus into the mux.  I haven't tried
> to build anything like that.  It might be simpler if the mux
> is distributed or some trick like that.
> 
> --
> These are my opinions, not necessarily my employeers.  I hate spam.

-- 
-Ray Andraka, P.E.
President, the Andraka Consulting Group, Inc.
401/884-7930     Fax 401/884-7950
email ray@andraka.com  
http://www.andraka.com

Article: 30068
Subject: Re: reduced precision floating point
From: "Jan Gray" <jsgray@acm.org>
Date: Thu, 22 Mar 2001 06:38:00 -0800
Links: << >> << T >> << A >>

Once again, I'll mention the wacky idea of compact but slow FP add,
including denorm/renorm, using a bit or nybble serial representation,
possibly with SRLs: www.fpgacpu.org/usenet/fp.html. :-)  (If you need higher
throughput, build n of them.)

Jan Gray, Gray Research LLC

Article: 30069
Subject: Re: Is the carry logic for Virtex included in PAR timing report/check?
From: "Austin Franklin" <austin@dark99room.com>
Date: Thu, 22 Mar 2001 10:11:11 -0500
Links: << >> << T >> << A >>

Hi Ray,

My inquiry had nothing to do with Jan's issue.  I have a design that claims
it makes timing, yet has timing problems in the lab.  The only thing that
has changed in it is doing a 48 bit add vs a 24 bit add...and the data
coming from this adder seems to be the source of corruption.

When I use FPGA Editor to look at this path, it says the carry chain has 0
delay...for all 48 elements.  I believe the speedfile may be broken for the
XCV300, or the FPGA Editor is broken...but something is broken, unless the
carry chain really does take 0ns through 48 levels...
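
A quick sanity check of why a zero carry delay looks suspicious can be sketched as
below.  This is a simple ripple estimate, and every delay value in it is a
placeholder I made up for illustration, not a number from any speed file or data
sheet.

```python
def adder_delay_ns(bits, t_in_ns, t_carry_per_bit_ns, t_out_ns):
    """Ripple estimate of worst-case adder delay.

    t_in_ns            operand input to the first carry (placeholder)
    t_carry_per_bit_ns carry propagation per bit position (placeholder)
    t_out_ns           last carry to the sum output (placeholder)
    """
    return t_in_ns + (bits - 1) * t_carry_per_bit_ns + t_out_ns


# Placeholder delays: the 48-bit adder should come out slower than the 24-bit one.
d24 = adder_delay_ns(24, 1.0, 0.05, 1.0)
d48 = adder_delay_ns(48, 1.0, 0.05, 1.0)

# With a per-bit carry delay of zero the width no longer matters at all,
# which is what a report of "0 delay for all 48 elements" would imply.
z24 = adder_delay_ns(24, 1.0, 0.0, 1.0)
z48 = adder_delay_ns(48, 1.0, 0.0, 1.0)
```

If the reported delay does not grow with the adder width, the tool is probably not
accounting for the carry chain in that path.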

Austin

"Ray Andraka" <ray@andraka.com> wrote in message
news:3AB9FCA4.5E8C7C72@andraka.com...
> Yes, it is.  What Jan was complaining about is that if you do an add-mux in one
> level of logic the mux action gates off the carry chain making the carry chain
> timing a false path, which artificially makes the timing for the mux-direct
> input much tighter than it really is.
>
> Austin Franklin wrote:
> >
> > I am curious if the carry logic is included in the PAR timing report...does
> > anyone know this for sure?  Version 3.2.05i.
>
> --
> -Ray Andraka, P.E.
> President, the Andraka Consulting Group, Inc.
> 401/884-7930     Fax 401/884-7950
> email ray@andraka.com
> http://www.andraka.com

Article: 30071
Subject: Globals are plenty fast
From: Austin Lesea <austin.lesea@xilinx.com>
Date: Thu, 22 Mar 2001 08:13:43 -0800
Links: << >> << T >> << A >>

Rick,

I have not come to bury PLL's, but to praise them.  In some applications, they are the
best way to solve the problem.  Just be sure they work with the rest of the system.

The 'problem' here is that Virtex and Virtex E have 4 and 8 global clock resources,
respectively, so that in some cases where they want multiple banks of I/O's running at
622 Mb/s, they run out of clocks (and DLL's).  Altera has even fewer clock resources.

The Virtex II has 16 global clock resources.  From any clock input to any IO FF the
maximum skew is +/-75 ps.  The global clocks run quite nicely at 420 MHz (along with
the DCM's) for the 840 Mb/s SPI POS 4 application.

So, the assumption is incorrect.  The global clocks are quite adequate for the 311 MHz
required for DDR at 622 Mb/s in Virtex E.  There are many products  running at 155 MHz
in Virtex, and 311 MHz (622 Mb/s DDR) in Virtex E with one clock per 8 to 16 IO's.
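
The clock figures above follow from simple arithmetic: a DDR interface moves two bits
per clock cycle on each pin, so the clock runs at half the bit rate.  A minimal sketch
(the function name is hypothetical):

```python
def required_clock_mhz(bit_rate_mbps, ddr=True):
    """Clock frequency needed per pin for a given bit rate.

    DDR captures data on both clock edges (2 bits per cycle);
    single data rate captures on one edge (1 bit per cycle).
    """
    bits_per_cycle = 2 if ddr else 1
    return bit_rate_mbps / bits_per_cycle


virtex_e_clock = required_clock_mhz(622)    # DDR at 622 Mb/s
virtex_ii_clock = required_clock_mhz(840)   # DDR at 840 Mb/s
sdr_clock = required_clock_mhz(622, ddr=False)
```

Without DDR, the same 622 Mb/s would need a 622 MHz clock, which is the case
discussed in the next paragraph.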

At 622 MHz, I think you are well out of the datasheet specifications on the clock
resources and interconnect for any FPGA out there.  It seems that for bit rates >500
Mb/s, or clock rates > 500 MHz, everything will be done by serializer/deserializers on
chip, or "specialized" logic (ie dedicated hardware interfaces) due to the signal
integrity nightmares, and the collision with trying to make a general IO do magic.

As an interesting aside, the input pins of anyone's package are typically series
resonant at ~650 MHz.  What this means is that there is no signal on the input pin,
yet the signal may still make it to the IOB (the magic of transmission lines).  Makes it
really hard to debug!

Austin

Rick Collins wrote:

> Interesting that you mention the IO. At a company where I have done
> work, they needed a 622 MHz interface. The Xilinx solution using Virtex
> required multiple inputs for the clock. I assume that the global clock
> routing was not up to the task at that speed. With 16 data bits and 8
> clock inputs, the resync problem was so large that they ended up using
> the Altera parts. They used 16 data inputs and a single ~80 MHz clock
> input. A PLL ran the clock back up to 622 MHz and the input FFs all ran
> off the same clock.
>
> So neither system really did what the designer wanted. But the Altera
> solution was the simplest at the time.
>
>
> Ray Andraka wrote:
> >
> > Rick,
> >
> > You've apparently discovered one of (what I consider to be) the major advantages
> > to the xilinx architecture.  The SRL16/CLB ram capability of the Xilinx
> > architectures has enormous potential for reducing the size of a design compared
> > to any other FPGA out there.  I've also mentioned the limitations of the carry
> > chain implementation in Altera families here before, as compared to the Xilinx
> > carry structure.  From a silicon standpoint, the Xilinx architecture is superior
> > for data path and signal processing designs.  As for the 622MHz I/O, be cautious
> > of the claims of any vendor. There are plenty of caveats.  Also, even if the IO
> > pins support it, make sure you understand the clocking structure as well, as the
> > clock is often the limiting factor rather than the IOBs.
> >
> > Rick Collins wrote:
> > >
> > > Falk Brunner wrote:
> > > >
> > > > Austin Lesea wrote:
> > > > >
> > > > > For those interested:
> > > > >
> > > > >  http://www.xilinx.com/products/virtex/techtopic/vtt013.pdf
> > > >
> > > > ;-))) This is getting funny.
> > > >
> > > > > Comments are appreciated,
> > > >
> > > > Hmm, what should we expect?? That Altera says the Xilinx parts are
> > > > better??
> > > > And Xilinx says the Altera parts are better??
> > > > Both "experiments" have their points, but they both have the smell of
> > > > marketing and influenced by company policy.
> > > > It's like the Pepsi and Coca fight . . .
> > > > After all, both devices must prove their qualities in real world
> > > > application, it's always possible to bring a good device down on the
> > > > knees with a heavy test (and vice versa ;-))
> > > >
> > > > --
> > > > MFG
> > > > Falk
> > >
> > > Yes, and this also ignores the many other issues involved in picking an
> > > FPGA vendor. I am working with a company that does not commit to a
> > > single vendor. They do their FPGA designs in HDL and do not use heavily
> > > the proprietary features unless necessary. They then pick the chip for
> > > the board at the final stage before building the prototype. This
> > > maximizes their leverage and gets them the best price for their boards.
> > >
> > > Of course there are times that they have to pick one or the other based
> > > on technical features. A new design with 10 Gbps fiber interface was
> > > just not doable in a Xilinx part because of the high speed (622 MHz)
> > > data path. The Altera part does this with a single clock. The Xilinx
> > > solution was to use a clock for every two data pins. They would have
> > > then needed fifos to resync the data to a common clock. The designers
> > > felt this was not workable.
> > >
> > > I personally am more impressed with the Xilinx parts. I recently found
> > > that the low cost ACEX parts from Altera (based on the 10K arch) does
> > > not let you use the LUTs as RAM. I see this as a major drawback when you
> > > need many small fifos.
> > >
> > > But again a non-technical issue of supply may force me to use the ACEX
> > > instead of the Spartan II parts.
>
> --
>
> Rick "rickman" Collins
>
> rick.collins@XYarius.com
> Ignore the reply address. To email me use the above address with the XY
> removed.
>
> Arius - A Signal Processing Solutions Company
> Specializing in DSP and FPGA design      URL http://www.arius.com
> 4 King Ave                               301-682-7772 Voice
> Frederick, MD 21701-3110                 301-682-7666 FAX

Article: 30072
Subject: Re: TBUFs in Virtex and later chips, going out of fashion, what instead
From: Austin Lesea <austin.lesea@xilinx.com>
Date: Thu, 22 Mar 2001 08:22:14 -0800
Links: << >> << T >> << A >>

Hal,

The mux's don't have to drive across the whole chip.  They may only need to drive to
one of the CLB's that can be reached by the directs, doubles (or hexes), so the
amount of logic that can be reached quickly is large.

This discussion reminds me of the days when people still argued that assembly
language programming was "better" than any C compiler ever could be.

Just like what happened to compiled languages, we are already there with compiled
HDL's: if it isn't fast enough after the best the tools can do, you go in and "code"
by hand the pieces that were not implemented with the best possible efficiency (if
there are any).

Or, you realize that you didn't take advantage of the parallelism, or some feature of
the part, and you recode your HDL.

With > 95% of all FPGA designs being done in HDL's, we think we are spending time and
silicon on what makes that flow the fastest and best possible.

Austin

Hal Murray wrote:

> >As the chips scale, driving a 1" long wire became slower and slower (relative to
> >the whole picture).  Recently we removed TBUF's and replaced them with mux's in
> >a few thousands of designs, and without replacing and rerouting, the designs
> >were all faster.  They also took up more area (the mux's).  By removing TBUF's
> >we gain that area back (TBUF's have to be huge to drive the long wires), so that
> >now are still more area efficient, and higher speed than before.  With PAR, the
> >designs are all "better" than before.
>
> I'm missing something.  Why are muxes better/faster?
>
> I see why driving a 1" long wire is tough, but I don't see why
> the driver after a mux is different from a TBUF.
>
> In one case you have to turn the driver on.  In the other case
> you have to get half way across the chip, then through a mux.
> I'm assuming that getting the go signal to the tbuf is about
> as hard as getting the select signals to a mux.
>
> ----
>
> In either case, a critical parameter is how well the TBUF/mux
> density lines up with a counter.  Assume I have a counter
> that uses the "obvious" fast carry logic.  I need to get
> that on the bus and to load other counters and registers
> from that bus.
>
> How many bits per CLB does that good counter use?
>
> If I'm using TBUFs, I need that many TBUFs per CLB.
>
> If I'm using muxes, I need enough routing to get all the
> registers that drive the bus into the mux.  I haven't tried
> to build anything like that.  It might be simpler if the mux
> is distributed or some trick like that.
>
> --
> These are my opinions, not necessarily my employeers.  I hate spam.

Article: 30073
Subject: Re: TBUFs in Virtex and later chips, going out of fashion, what instead
From: Austin Lesea <austin.lesea@xilinx.com>
Date: Thu, 22 Mar 2001 08:24:41 -0800
Links: << >> << T >> << A >>

Ray,

Horizontal Cascade Carry.  Virtex II has so many new features, some tend to get lost.

Austin

Ray Andraka wrote:

> If xilinx extended the horizontal or chains in V2 so that there was at least 4
> per CLB, we'd be all set.
>
> Hal Murray wrote:
> >
> > >As the chips scale, driving a 1" long wire became slower and slower (relative to
> > >the whole picture).  Recently we removed TBUF's and replaced them with mux's in
> > >a few thousands of designs, and without replacing and rerouting, the designs
> > >were all faster.  They also took up more area (the mux's).  By removing TBUF's
> > >we gain that area back (TBUF's have to be huge to drive the long wires), so that
> > >now are still more area efficient, and higher speed than before.  With PAR, the
> > >designs are all "better" than before.
> >
> > I'm missing something.  Why are muxes better/faster?
> >
> > I see why driving a 1" long wire is tough, but I don't see why
> > the driver after a mux is different from a TBUF.
> >
> > In one case you have to turn the driver on.  In the other case
> > you have to get half way across the chip, then through a mux.
> > I'm assuming that getting the go signal to the tbuf is about
> > as hard as getting the select signals to a mux.
> >
> > ----
> >
> > In either case, a critical parameter is how well the TBUF/mux
> > density lines up with a counter.  Assume I have a counter
> > that uses the "obvious" fast carry logic.  I need to get
> > that on the bus and to load other counters and registers
> > from that bus.
> >
> > How many bits per CLB does that good counter use?
> >
> > If I'm using TBUFs, I need that many TBUFs per CLB.
> >
> > If I'm using muxes, I need enough routing to get all the
> > registers that drive the bus into the mux.  I haven't tried
> > to build anything like that.  It might be simpler if the mux
> > is distributed or some trick like that.
> >
> > --
> > These are my opinions, not necessarily my employeers.  I hate spam.
>
> --
> -Ray Andraka, P.E.
> President, the Andraka Consulting Group, Inc.
> 401/884-7930     Fax 401/884-7950
> email ray@andraka.com
> http://www.andraka.com

Article: 30074
Subject: Re: TOA measurement
From: Jerry Avins <jya@ieee.org>
Date: Thu, 22 Mar 2001 13:00:41 -0500
Links: << >> << T >> << A >>

Michal Kvasnicka wrote:
> 
> Did you read my previous posts?

I thought so. Either I missed something or I had forgotten.
> 
 ...
> 
> TOA is measured as absolute time distributed in the
 network of the receivers
 ^^^^^^^ ^^ ^^^ ^^^^^^^^^   That's what I missed!
> with accuracy about 1ns. This time is not necessarily synchronized with UTC ...

> What now? Any suggestion from your side?
> 
> Regards, Michal
> 
  ...

Is there some way that your receivers can communicate with one another
at high bandwidth? If so, correlations could be performed between the
same pulse received in different places. That would overcome the
variability of the pulse shape. It's probably not practical. Oh, well!
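
To make the idea concrete, here is a minimal sketch of estimating the arrival-time
difference by cross-correlating the same pulse as sampled at two receivers.  The pulse
shape, the record lengths, and the 10 ns sample period (100 MHz sampling) are all
assumptions for illustration, and the brute-force search stands in for whatever
correlator would really be used.

```python
def cross_correlate_lag(a, b, max_lag):
    """Return the lag (in samples) that maximizes sum(a[n] * b[n + lag]).

    A positive result means the pulse appears later in b than in a.
    """
    best_lag, best_score = 0, float("-inf")
    for lag in range(-max_lag, max_lag + 1):
        score = 0.0
        for n, x in enumerate(a):
            m = n + lag
            if 0 <= m < len(b):
                score += x * b[m]
        if score > best_score:
            best_lag, best_score = lag, score
    return best_lag


# Arbitrary pulse shape; both receivers see it, receiver B five samples later.
pulse = [0, 1, 3, 7, 4, 2, 1, 0]
rx_a = [0] * 10 + pulse + [0] * 22
rx_b = [0] * 15 + pulse + [0] * 17

SAMPLE_PERIOD_NS = 10.0  # assumed 100 MHz sampling
lag = cross_correlate_lag(rx_a, rx_b, max_lag=12)
tdoa_ns = lag * SAMPLE_PERIOD_NS
```

Because both records contain the same pulse, the peak location does not depend on
knowing the pulse shape in advance.  The resolution of this sketch is one sample
period; getting closer to 1 ns would need interpolation around the peak or a higher
sample rate.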

Jerry
-- 
Engineering is the art of making what you want from things you can get.
-----------------------------------------------------------------------
