You mentioned in a recent thread that the 3S1000 would sell for $20 in CY2004 in the slowest speed grade and large quantities. I was recently quoted $85.65 for XC3S1000-4FG676C in 5000s for CY2004. Is there really such a huge difference between 5000-piece prices and "large quantities"? I ended up going with an off-the-shelf solution as being more cost-effective, but a $35-ish price might have made for a different decision.Article: 60376
Peter Alfke wrote: > > I have a new idea how to simplify the metastable explanation and calculation. > Following Albert Einstein's advice that everything should be made as > simple as possible, but not any simpler: Quite agree. > > We all agree that the extra metastable delay occurs when the data input > changes in a tiny timing window relative to the clock edge. We also > agree that the metastable delay is a strong function of how exactly the > data transition hits the center of that window. > That means, we can define the width of the window as a function of the > expected metastable delay. > > Measurements on Virtex-IIPro flip-flops showed that the metastable > window is: > > • 0.07 femtoseconds for a delay of 1.5 ns. > • The window gets a million times smaller for every additional 0.5 ns of delay. > > Every CMOS flip-flop will behave similarly. The manufacturer just has > to give you the two parameters ( x femtoseconds at a specified delay, > and y times smaller per ns of additional delay) > > The rest is simple math, and it even applies to Jim's question of > non-asynchronous data inputs. I like this simple formula because it > directly describes the actual physical behavior of the flip-flop, and > gives the user all the information for any specific systems-oriented > statistical calculations. eg: Take a system that is not randomly async, but by some quirk of nature, actually has two crystal sources, one for clock, and another for the data. These crystals are quite stable, but have a slow relative phase drift due to their 0.5ppm mismatch. Now let's say I want to know not just the statistical average, but to get some idea of the peak - the real failure mode is not 'white noise', but has distinct failure peaks near 'phase lock', and nulls clear of this. Seems management wants to know how bad it can get, for how long, not just 'how good it is, on average', so we'll humour them :) That's a "specific systems-oriented statistical calculation".
Please demonstrate how to apply the above x & y, to give me all the information I seek. -jgArticle: 60377
Pete, I have forwarded your e-mail to our marketing group. Let's see what they say. I do not have to explain to you that late 2004, slowest speed grade, plus high volume are the parameters that get you the lowest price. Let's hear it from Marketing... Peter ================= Pete Fraser wrote: > > You mentioned in a recent thread that the 3S1000 would sell > for $20 in CY2004 in the slowest speed grade and large > quantities. > > I was recently quoted $85.65 for XC3S1000-4FG676C > in 5000s for CY2004. Is there really such a huge difference > between 5000 piece prices and "large quantities"? > > I ended up going with an off-the-shelf solution as being more > cost-effective, but a $35 ish price might have made for > a different decision.Article: 60378
You would need at least DVI receivers, transmitters, and the FPGA. Unless you can synchronize the two output streams, you would need memory for buffering, so that you can "line up" the pixels. If the PCs are outputting different resolutions or at different refresh frequencies, you would need a scaler/converter. murkspi@amuro.net (mur KSpi) wrote in message news:<e0e6dba0.0309110536.18a61b2b@posting.google.com>... > Is the following achievable? > > - Two PCs render graphics and output it to their graphics card DVI > outputs > - A FPGA based board reads these two data streams (maybe using Silicon > Image receivers?) and processes the data (basically a comparison of > pixel values) > - The processed data is output via DVI. This output could be used as > an input to another FPGA board and so forth.. > > What components (FPGAs, DVI receivers, transmitters) would one need? > Thanks for all answers.. > > AndreArticle: 60379
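Once the two DVI streams are buffered and aligned, the pixel comparison itself is the simple part. A rough Python model of a per-frame compare (the frame layout as lists of rows and the difference threshold are illustrative assumptions, not details from the thread):

```python
# Toy model of the pixel-comparison stage: two aligned frames in,
# a difference map out. In hardware this would be one subtract/compare
# per pixel clock; here we model a full frame at once.

def compare_frames(frame_a, frame_b, threshold=0):
    """Return a per-pixel map: 1 where the frames differ by more
    than `threshold`, 0 where they match. Frames are lists of rows."""
    if len(frame_a) != len(frame_b):
        raise ValueError("frames must have the same number of lines")
    diff = []
    for row_a, row_b in zip(frame_a, frame_b):
        diff.append([1 if abs(a - b) > threshold else 0
                     for a, b in zip(row_a, row_b)])
    return diff

# Two tiny 2x3 "frames" of 8-bit pixel values
a = [[10, 20, 30], [40, 50, 60]]
b = [[10, 25, 30], [40, 50, 61]]
print(compare_frames(a, b, threshold=2))  # [[0, 1, 0], [0, 0, 0]]
```

The buffering and line-up logic the post describes exists precisely so that `zip`-style pairing like this is valid: without synchronized or rescaled streams, corresponding pixels never arrive together.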
Hi Mario, Mario Trams wrote: > Dear fellows, > > according to > http://www.xilinx.com/ise/embedded/gdb_debugger.htm > Xilinx has slightly modified the gdb used for debugging of the > VirtexIIPro PPC405. > > Does anybody know whether there are sources or at least > patches of/for this gdb lying around somewhere? > I couldn't find anything so far and I guess that they are > not publicly available. They're there, but for some reason Xilinx makes them very hard to find: http://www.xilinx.com/guest_resources/gnu/index.htm > Actually, I would like to use the debugger under native > Linux with ddd frontend rather than inside a VMware with > Windows/Cygwin as I'm doing it right now. The source package at the address I just gave has the entire toolchain source for the EDK, microblaze and PPC, binutils, gcc, gdb and so on. We've successfully rebuilt binutils and gdb under linux native, so it definitely can be done. Regards, JohnArticle: 60380
I wrote: > The source package at the address I just gave has the entire toolchain > source for the EDK, microblaze and PPC, binutils, gcc, gdb and so on. Let me clarify: the package contains sources for gcc, gdb and binutils, for both Microblaze and PPC. JohnArticle: 60381
Hi Terry, Terry Andersen wrote: > I have a MB1000 board from Insight, I use the reference design available > "VII_MicroBlaze_DDR_Reference_Design". It works ok, but as soon as I add an > interrupt controller (opb_intc) and make the timer go through the interrupt > controller and interrupt the Microblaze I can't read anything from the > DDR-RAM!!! The timer runs fine though.....All I read from the DDR-RAM is > zeros :-( Can you post your MHS file? > Anyone has an idea of what is wrong? Someone has tried similar? Go to the microblaze uclinux web site (see below) and download the "mbvanilla_ddr" hardware project (under Downloads). There you can find a large microblaze project that has several uarts and a timer (all driving an INTC), as well as SRAM and DDR controller. It works out of the box for the v2mb1000 board (constraints and everything) - you should be able to extract what you need to get your project going. http://www.itee.uq.edu.au/~jwilliams/mblaze-uclinux Regards, JohnArticle: 60382
Interesting. Let's say we have two frequencies, 100 MHz even, and 100.000 050 MHz, which is 50 Hz higher. These two frequencies will beat or wander over each other 50 times per second. Assuming no noise and no jitter, each step will be 10 ns divided by 2 million = 5 femtoseconds. That is 80 times wider than the capture window for a 1.5 ns delay. Therefore we can treat this case the same way as my original case with totally asynchronous frequencies. I think even jitter has no bearing on this, because it also would be far, far wider than the capture window. That means, this slowly drifting case is not special at all, except that metastable events would be spaced by multiples of 20 ms (1/50 Hz) apart. But that's irrelevant for events that occur on average once per year or millennium. Now, you will never ever, under any circumstances, get a guarantee not to exceed a long delay, since by accident the flip-flop might go perfectly metastable and stay for a long time. It's just an extremely small probability, expressed as a very, very long MTBF. That is the fundamental nature of metastability. To repeat, I like the capture window approach because it is independent of data rate and clock rate. Greetings, and thanks for the discussion. It helped me clear up my mind... Peter Alfke ================================= Jim Granville wrote: > > Peter Alfke wrote: > > > > I have a new idea how to simplify the metastable explanation and calculation. > > Following Albert Einstein's advice that everything should be made as > > simple as possible, but not any simpler: > > Quite agree. > > > > > We all agree that the extra metastable delay occurs when the data input > > changes in a tiny timing window relative to the clock edge. We also > > agree that the metastable delay is a strong function of how exactly the > > data transition hits the center of that window. > > That means, we can define the width of the window as a function of the > > expected metastable delay.
> > > > Measurements on Virtex-IIPro flip-flops showed that the metastable > > window is: > > > > • 0.07 femtoseconds for a delay of 1.5 ns. > > • The window gets a million times smaller for every additional 0.5 ns of delay. > > > > Every CMOS flip-flop will behave similarly. The manufacturer just has > > to give you the two parameters ( x femtoseconds at a specified delay, > > and y times smaller per ns of additional delay) > > > > The rest is simple math, and it even applies to Jim's question of > > non-asynchronous data inputs. I like this simple formula because it > > directly describes the actual physical behavior of the flip-flop, and > > gives the user all the information for any specific systems-oriented > > statistical calculations. > > eg: Take a system that is not randomly async, but by some quirk of > nature, actually has two crystal sources, one for clock, and another > for the data. These crystals are quite stable, but have a slow > relative phase drift due to their 0.5ppm mismatch. > > Now let's say I want to know not just the statistical average, but to > get > some idea of the peak - the real failure mode is not 'white noise', but > has distinct failure peaks near 'phase lock', and nulls clear of this. > Seems management wants to know how bad it can get, for how long, > not just 'how good it is, on average', so we'll humour them :) > > That's a "specific systems-oriented statistical calculation". > Please demonstrate how to apply the above x & y, to give me > all the information I seek. > > -jgArticle: 60383
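Peter's two-parameter window model, and the beat-frequency arithmetic in his reply, can both be checked numerically. A small Python sketch (the 0.07 fs / 0.5 ns window parameters and the 100 MHz vs. 100.000050 MHz figures are from the posts; using half the clock rate as the asynchronous data rate in the MTBF formula is my own illustrative choice):

```python
# Metastability capture-window model from the thread:
#   window is 0.07 fs at a 1.5 ns settling delay, and shrinks by a
#   factor of 1e6 for every additional 0.5 ns of allowed delay.
W0 = 0.07e-15        # seconds, window width at the reference delay
T0 = 1.5e-9          # seconds, reference delay
SHRINK = 1e6         # window shrink factor per STEP of extra delay
STEP = 0.5e-9        # seconds

def window(delay_s):
    """Capture-window width for a given allowed settling delay."""
    return W0 * SHRINK ** (-(delay_s - T0) / STEP)

def mtbf(f_clock, f_data, delay_s):
    """Mean time between metastable failures for async data: the rate
    of clock/data coincidences landing inside the window."""
    return 1.0 / (f_clock * f_data * window(delay_s))

# Beat-frequency case from the post: 100 MHz vs. 100.000050 MHz.
f, df = 100e6, 50.0
step = (1.0 / f) * (df / f)       # phase step per clock, seconds
print(step)                        # 5e-15 s = 5 fs, as in the post
print(step / window(1.5e-9))       # ~71x the 0.07 fs window ("80" in the post)

# MTBF allowing 3 ns of settling, 100 MHz clock, 50 MHz async data
print(mtbf(100e6, 50e6, 3e-9))     # astronomically long, in seconds
```

The 5 fs step being much wider than the window is exactly why the drifting-crystal case can be treated like the fully asynchronous one: successive clock edges cannot "camp" inside the window.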
"emanuel stiebler" <emu@ecubics.com> wrote in message news:3f60e67a$0$28899$afc38c87@... > <snip> > You can ask the same question again on the microblaze forum on xilinx ><snip> Wow, microblaze has a forum?!? I sure wish Altera would take a que and do the same for Nios. A quick perusal of the Xilinx forums shows a lot of discussion of the type I would like to have about the Nios. Not really important enough to inject into this general group (comp.arch.fpga) but of interest to other Nios users. Sorry to interrupt the thread. On Altera's site you can find the link to the gnupro sources by searching the knowledge base. KenArticle: 60384
It seems to be possible to run the Xilinx 6.1i tools on Red Hat 9; I thought I'd post the details to save people Googling for the answer. For the install, run:

bash# cd /mnt/cdrom
bash# LD_ASSUME_KERNEL=2.4.1 ./setup

for each of the two CD-ROMs. There is a warning message early on about some U/Win (sp?) library, which I ignored. Assuming one chooses /opt/xilinx for the install location, the following lines would go into .bashrc:

export XILINX=/opt/xilinx
export PATH=/opt/xilinx/bin/lin:$PATH
export LD_LIBRARY_PATH=/opt/xilinx/bin/lin:$LD_LIBRARY_PATH

The command-line tools (xst, par, etc.) seem not to require the LD_ASSUME_KERNEL magic, but GUI tools like the FPGA editor do require it. I have no deep knowledge of what this environment variable does to glibc, only that it works for me, so far. I suppose Xilinx may or may not "support" this. Cheers, Peter MontaArticle: 60385
I understand about the phase align capabilities provided by the DCM, but what clock domain are the LOCKED and STATUS bits created in? I want to feed LOCKED into a state machine, but I don't see in the Xilinx docs anyplace which clock produces it. John Providenza "Steven K. Knapp" <steve.knappNO#SPAM@xilinx.com> wrote in message news:<bjobmb$oab1@cliff.xsj.xilinx.com>... > You can phase align feedback using either the CLK0 or CLK2X DCM outputs. > > There is a relatively new application note on DCMs in Spartan-3 that may be > useful to your question. > > XAPP462: Using Digital Clock Managers (DCMs) in Spartan-3 FPGAs > http://www.xilinx.com/bvdocs/appnotes/xapp462.pdf > > Phase alignment is optional for the Digital Frequency Synthesizer function > in a DCM. > --------------------------------- > Steven K. Knapp > Applications Manager, Xilinx Inc. > Spartan-3/II/IIE FPGAs > http://www.xilinx.com/spartan3 > --------------------------------- > Spartan-3: Make it Your ASIC > > > "John Providenza" <johnp3+nospam@probo.com> wrote in message > news:349ef8f4.0309101349.2cbe675a@posting.google.com... > > I don't see in the Xilinx documentation in what clock domain > > the LOCK signal coming from a DCM is produced. Do I need to > > synchronize it into the CLK0 domain to avoid metastability? > > > > Thanks! > > > > > > John ProvidenzaArticle: 60386
We have a CMOS (1.3 mpixel) camera based on TI TMS320DM642 DSP (4000 MIPS) with USB 2.0 connectivity. Changing it to your sensor should be straightforward, since the sensor is on its own board. Contact me at rich@XXapollo-image.com; take out the "XX" and I will give you the password to our web site. "GB" <donotspam_grantbt@jps.net> wrote in message news:fNH6b.2542$PE6.510@newsread3.news.pas.earthlink.net... > Hi, > > I'm a firmware guy pulled into a project well out of my area of > expertise. My boss wants to build (essentially) a digital camera > using an image sensor chip (1600x1200) and output its data > "as fast as possible" using USB2.0. His initial concept, being > that I'm a firmware guy, was to use a "really fast micro." > Normally the company is involved with small 8-bitter micro > projects, so you can see I'm well out of my normal bounds. > > Now this seems like a pretty big stretch to me... and I don't see > how it can be done without the speed of hardware (the image > chips all seem to have clock speeds in the tens of MHz and the > USB2 transfer rate is 480Mbps (theor.)). Do aspects of this > project require an FPGA to keep the data "as fast as possible?" > If we use an FPGA for camera-to-RAM and then use a > micro for the USB2 part, what kind of fast micros can > move data at that rate? > > Also, this is something that I am sure we will have to contract > out, so if you have any past experience with this, please let > me know your thoughts (and if you are available). > > Thanks! > >Article: 60386
Should be good - burst rate is listed at 480, but fast wide SCSI had burst rates like this - what is the sustained transfer rate - still not really documented. http://www.cypress.com/cfuploads/img/products/CY7C68013.pdf is a good chip for doing this, I think. Tell me how it turns out Andrew Neil Franklin wrote: >Andrew Paule <lsboogy@qwest.net> writes: > > > >>USB (even at 12Mbs is too slow >>for this stuff ( a 1Mpixel at 8 bit will require 2/3 sec + overhead to >>empty one frame) - tell you boss it sounds cool, but you need 1394 >>(firewire) or SCSI to do this worth a DS. >> >> > >According to the Subject: line he is intending to use USB2. > >That is 480MBit/s, according to http://www.usb.org/faq/ans2#q1 . > >Should be fast enough. > > >-- >Neil Franklin, neil@franklin.ch.remove http://neil.franklin.ch/ >Hacker, Unix Guru, El Eng HTL/BSc, Programmer, Archer, Blacksmith >- hardware runs the world, software controls the hardware > code generates the software, have you coded today? > >Article: 60388
Andrew No problem. james On Thu, 11 Sep 2003 11:51:09 -0500, Andrew Paule <lsboogy@qwest.net> wrote: >Sorry - no offense meant. > >Andrew > >james wrote: > >>This is not my project. It's GB's project. >> >>Still you can not build hardware around an unknown sensor. You have to >>pick an imager and then build the hardware/firmware around it. >> >>james >> >> >>On Wed, 10 Sep 2003 20:57:32 -0500, Andrew Paule <lsboogy@qwest.net> >>wrote: >> >> >> >>>Use the micro to set up the packets in an FPGA/ASIC under isochronous >>>control, and stream them out from there, if you can deal with the low >>>data/frame rates - I used to build large format (4 x 5 and Hasselblad) >>>camera digital inserts using big CCD's - USB (even at 12Mbs is too slow >>>for this stuff ( a 1Mpixel at 8 bit will require 2/3 sec + overhead to >>>empty one frame) - tell your boss it sounds cool, but you need 1394 >>>(firewire) or SCSI to do this worth a DS. If you simply want to capture >>>one frame and empty it - consider dumping it to RAM and then out from >>>there. Leaving stuff on a sensor - CCD or CMOS (yes CCD's are CMOS, >>>both n and p type are built) results in large dark currents that make >>>them unusable. At 2/3 second, well depth will be a large consideration >>>here - electrons like to move around. >>> >>>Andrew >>> >>>james wrote: >>> >>> >>> >>>>On Sun, 07 Sep 2003 15:03:39 GMT, "GB" <donotspam_grantbt@jps.net> >>>>wrote: >>>> >>>> >>>> >>>> >>>> >>>>>Hi, >>>>> >>>>>I'm a firmware guy pulled into a project well out of my area of >>>>>expertise. My boss wants to build (essentially) a digital camera >>>>>using an image sensor chip (1600x1200) and output its data >>>>>"as fast as possible" using USB2.0. His initial concept, being >>>>>that I'm a firmware guy, was to use a "really fast micro." >>>>>Normally the company is involved with small 8-bitter micro >>>>>projects, so you can see I'm well out of my normal bounds. >>>>> >>>>>Now this seems like a pretty big stretch to me...
and I don't see >>>>>how it can be done without the speed of hardware (the image >>>>>chips all seem to have clock speeds in the tens of MHz and the >>>>>USB2 transfer rate is 480Mbps (theor.)). Do aspects of this >>>>>project require an FPGA to keep the data "as fast as possible?" >>>>>If we use an FPGA for camera-to-RAM and then use a >>>>>micro for the USB2 part, what kind of fast micros can >>>>>move data at that rate? >>>>> >>>>>Also, this is something that I am sure we will have to contract >>>>>out, so if you have any past experience with this, please let >>>>>me know your thoughts (and if you are available). >>>>> >>>>>Thanks! >>>>> >>>>> >>>>> >>>>> >>>>^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ >>>>1600 x 1200 is essentially a 2 megapixel camera! >>>> >>>>1) First step is to determine what the camera is going to be used for! >>>> >>>>Terrestrial or astronomical or video photography >>>> >>>>2) Pick an imager! Either Sony, TI, Kodak, or Panasonic to name a few. >>>> >>>>3) From the Imager specs you can derive how fast the data can be >>>>clocked out of the imager. Most imagers will transfer the image area >>>>into a serial register one line at a time. How fast this is depends on >>>>how fast you can clock the serial register. Transfer speeds differ >>>>from vendor to vendor. >>> >>> >>>>4) Then build the circuitry around the imager based on its ability to >>>>transfer the full image as fast as you want and that meets your cost >>>>goals. >>>> >>>>Again depending on what you determine as reasonably fast will affect >>>>the cost of the imager along with its size. Another consideration will >>>>be the speed of the ADC. That can slow things down also. Even if you >>>>can clock the serial register of the imager out at a 20 MHz rate, if the >>>>ADC sample rate is 10 MHz that is as fast as you can get the pixel >>>>data out.
>>>> >>>>IF your imager's max clock frequency for the serial register is 20 >>>>MHz, you can clock a 1600 pixel row out in about 80 µs. Or the whole >>>>image area out of the chip in about 100 ms. So your microC or FPGA >>>>will have to read the ADC once every 50 ns or so during the readout >>>>period. >>>> >>>>There are some CPU cores as well as USB cores that can drop into an >>>>FPGA. You can build a large FIFO or add onboard flash to store the >>>>picture. >>>> >>>>It is not crazy at all in fact it is quite doable. The two key items >>>>in a digital camera are the imager and the ADC. All the rest is digital >>>>hardware that is well suited for an ASIC or FPGA. >>>> >>>> >>>>james >>>> >>>> >>>> >>>> >>>> >> >> >>Article: 60389
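The readout arithmetic in the post above is easy to sanity-check, together with the USB 2.0 throughput question raised earlier in the thread. A quick Python check (the 20 MHz pixel clock and 8-bit samples are the post's figures; the 60% usable-bandwidth factor for high-speed transfers is my own rough assumption, not a measured number):

```python
# Sanity-check the 1600x1200 readout timing from the thread.
COLS, ROWS = 1600, 1200
PIX_CLK = 20e6                       # Hz, serial-register clock from the post

row_time = COLS / PIX_CLK            # seconds per line
frame_time = ROWS * row_time         # seconds per frame (ignoring blanking)
adc_period = 1.0 / PIX_CLK           # how often the ADC must be read

print(row_time * 1e6)    # 80.0 us per row, as in the post
print(frame_time * 1e3)  # 96 ms per frame (the post's "about 100 ms")
print(adc_period * 1e9)  # 50 ns per ADC sample

# Does USB 2.0 keep up? Raw pixel rate vs. high-speed signalling rate.
pixel_bits = COLS * ROWS * 8 / frame_time   # bits/s leaving the sensor
usb2_raw = 480e6                            # bits/s signalling rate
usable = 0.6 * usb2_raw                     # assumed realistic payload rate
print(pixel_bits / 1e6)        # 160 Mbit/s at this frame rate
print(pixel_bits < usable)     # True under the assumed 60% efficiency
```

So at the 20 MHz readout the sensor produces 160 Mbit/s, comfortably inside even a pessimistic USB 2.0 payload budget; the bottleneck is the imager/ADC clock, not the bus.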
Kenneth Land wrote: > "emanuel stiebler" <emu@ecubics.com> wrote in message > news:3f60e67a$0$28899$afc38c87@... > >><snip> >>You can ask the same question again on the microblaze forum on xilinx >><snip> > > > Wow, microblaze has a forum?!? I sure wish Altera would take a cue and do > the same for Nios. Yes, except that web-based fora are, almost without exception, clumsy and painful to use. > A quick perusal of the Xilinx forums shows a lot of discussion of the type I > would like to have about the Nios. Not really important enough to inject > into this general group (comp.arch.fpga) but of interest to other Nios > users. Every now and then I ponder the prospect of refloating the comp.arch.reconfigurable RFC.... the last time this was raised there was some vigorous discussion (google has it all), but the idea eventually died. Regards, JohnArticle: 60390
I have used Webpack 4.2 something and I believe the ISE Foundation of about the same edition. I have found that they are essentially identical, with the following differences: Webpack does not have the core generator. You cannot use the block RAMs and other on-chip resources. I also am not sure that you can change the synthesis flow from XST to anything else. I don't know this for fact, as I have not tried to do this. HTH Clyde Dave wrote: > I am just about to go through a 115 page introduction tutorial on the XCESS > website for using the Xilinx Webpack 4.x edition. However I will be using > the ISE Foundation 4.x edition and want to know if I am wasting my time > reading the entire Webpack tutorial to learn how to use the ISE Foundation > edition. I am assuming it's all the same, with Webpack just having fewer > features. Anyone who is familiar with both editions that can let me know to > go ahead with this or STOP - and find a tutorial at Xilinx instead (I need > to install the software for their tutes I think) would be much appreciated. > Initial stages will be purely schematic entry. VHDL will come later. > > Regards > DaveArticle: 60391
Ken, While the RSG solution may yield smaller designs for specific cases, the Altera FIR Compiler gives you more flexibility in terms of optimizing area vs. speed. For instance, the numbers presented in the RSG datasheet are based on a pipeline=2 setting for the Altera FIR Compiler. Using the FIR Compiler, the design yields an fmax of 322MHz (single rate, single channel). This is much higher than the 154MHz cited for the filter using the RSG approach. This is the classic speed/area trade-off scenario. If indeed area is the critical factor, it is possible to reduce the pipeline to 1 in the FIR Compiler. In the single rate cases, the logic cell count comparison would show that the RSG approach would be beneficial for the single and 2 channel FIR designs (58% and 80% respectively). In the 4 and 8 channel FIR designs, the distributed arithmetic approach employed by the Altera FIR Compiler yields better area compared to the RSG generated filter (106% and 133% respectively). Reducing the number of pipeline stages to 1 yields an fmax of 237MHz (single rate, single channel), still well beyond the performance requirement in most cases. This, I believe, is a more accurate comparison for the RSG datasheet. Regards, HS Tero Rissa <tpr@doc.ic.ac.uk> wrote in message news:<bjhr0m$7f7$1@harrier.doc.ic.ac.uk>... > I went around the irregularity issue by having a sub-multiplier > block architecture that has a fixed interface to the routing > and a fixed (yet reasonable) area. Therefore, when the > coefficients are changed, no place and route is required and > the latency remains the same (unless you change the number of taps). > The generation of coefficients can be done at reconfiguration time > thanks to symmetry in the FPGA used (Atmel 40K40). Naturally, > there is the problem of hassling with run-time reconfiguration > and everything that comes with that...
> > As part of this work we looked also into common subexpression > sharing in that particular FPGA family and found it very unlikely > that benefits could be obtained with a similar multiplier-block > architecture. This is mainly due to the fact that it is a different > story to be able to generate the most useful common subexpressions > than it is to really use them before the routing becomes congested. > > http://www.doc.ic.ac.uk/~tpr/papers/rissa_FPT02.pdf > > T.Rissa > > > Ray Andraka <ray@andraka.com> wrote: > > I agree the multiplier block style filters are more efficient area-wise. It > > sounds like you have addressed the irregularity issues by using a program > > to do the generation, which I think is pretty much a necessity. As I thought > > I alluded to, the biggest problem with multiplier block filters is that the > > layout/size is not a constant if you change the coefficients. This means that > > the filter coefficients have to be constant and known earlier in the design > > cycle, and necessitates a rerun of synthesis, place and route for any filter > > changes. Depending on the implementation, it may also mean a change in the > > filter's pipeline latency. These factors can make them difficult to use on > > some projects. The filters typically used in my projects often need to be > > adjusted by the customer or late in the project to accommodate minor > > requirements changes. I prefer to use a filter with reloadable coefficients > > for that reason. > > > > > Ken wrote: > > >> Ray, > >> > >> I sent this to Michael via email and he suggested the group would be > >> interested also... > >> > >> My PhD (now drawing to the end) has been on implementing full-parallel > >> Transpose FIR filters using multiplier blocks that you mention (I use > >> techniques/algorithms that exceed the efficiency of CSD in terms of FPGA > >> area).
> >> > >> The upshot of my work is that I have written a C++ program that will > >> generate RTL VHDL given the quantised filter coefficients, the type of > >> filter required (singlerate, interpolation, decimation etc.) and the > >> appropriate parameters (input width, signed/unsigned input, number of > >> channels, rate-change factor etc.) > >> > >> The VHDL my program generates exceeds the functionality (at a lower > >> cost) of that provided by Xilinx's Distributed Arithmetic core and Altera's > >> FIR Compiler (also DA). In fact, my program allows interpolation and > >> decimation factors up to the number of filter coefficients and any number of > >> data channels (for interpolation/decimation filters also). > >> > >> The main point is that, once synthesised and mapped to a specific FPGA, the > >> filters my program generates require far less FPGA area (slices/logic cells) > >> than those generated using Distributed Arithmetic. The critical path in my > >> filters is just the longest adder carry chain so very high speeds are > >> possible. E.g. 154MHz for a singlerate filter (25 bit output) in a Xilinx > >> xc2v3000-fg676-5 - obviously the speed will depend on the device > >> family/speed grade and the longest carry chain. The facility for multiple > >> channels in interpolation/decimation filters (not supported by Xilinx) > >> allows lower than full-parallel sampling rates to be efficiently processed > >> in one filter. > >> > >> As Michael points out in his post, this technique would be very suitable for > >> a > >> Xilinx Spartan-IIE and indeed any FPGA - there are many cases where these > >> filters would be useful even on devices with dedicated multipliers (when > >> they are all in use for example! ;-) ). 
> >> > >> You can find out more at http://www.dspec.org/rsg.asp - there are also > >> datasheets here that provide comparisons with Xilinx and Altera and > >> demonstrate the output of another application (written in java) that > >> generates schematic representations of the filters for use in reports, > >> meetings and theses! :-) > >> > >> I hope this information is of use to you - please contact me if you have any > >> questions, > >> > >> Thanks for your time, > >> > >> Ken > >> > >> -- > >> To reply by email, please remove the _MENOWANTSPAM from my email address. > >> > >> "Ray Andraka" <ray@andraka.com> wrote in message > >> news:3F54F936.5E694FD1@andraka.com... > >> > The problem with the multiplier block approach is that the > >> > construction is predicated on the specific coefficients. As > >> > a result it is considerably harder to use for an arbitrary > >> > set of coefficients. It may reduce area over a straight FIR > >> > filter running at the same clocks per sample, but at a > >> > considerable cost in design time and flexibility. You also > >> > give up regularity in the structure, which may reduce the > >> > overall performance. Essentially what the block multiplier > >> > and distributed arithmetic approaches are is a rearrangement > >> > of the bitwise product terms. The multiplier block takes > >> > advantage of duplicate terms by adding the inputs before > >> > they are multiplied by the term. > >> > > >> > Michael Spencer wrote: > >> > > >> > > Hello, > >> > > > >> > > Has anyone compared FPGA implementations of full-rate > >> > > digital FIR filters based on the use of Multiplier Blocks > >> > > vs. traditional FIRs with constant coefficient > >> > > multipliers? By full rate, I mean: one output result per > >> > > clock cycle and no interpolation or decimation.
> >> > > > >> > > For anyone not familiar, a multiplier block is a network > >> > > of shifters and adders that performs multiplications by > >> > > several coefficients efficiently by exploiting common > >> > > sub-expressions. The multiplier block can be exploited in > >> > > FIR filters by transposing the standard filter so that the > >> > > products of all the coefficients with the current > >> > > input-sample are required simultaneously. > >> > > > >> > > Also, by representing the coefficients in the > >> > > Canonical-Signed-Digit number system (a small number of > >> > > +1 and -1's) along with common sub-expression sharing, the > >> > > multiplier block can get even smaller. > >> > > > >> > > For example, the multiplier block for a 100 tap FIR filter > >> > > (fp=0.10 and fs=0.12) can be realized with only 61 adds > >> > > (zero explicit multiplications). See filter example #4 in > >> > > "FIR Filter Synthesis Algorithms for Minimizing the Delay > >> > > and the Number of Adders," > >> > > http://ics.kaist.ac.kr/~dk/papers/TCAD2001.pdf > >> > > If the adder depth is constrained to a maximum of four, > >> > > then the authors' algorithm can do the multiplier block in > >> > > 69 additions. > >> > > > >> > > It would seem that this approach would be very efficient > >> > > in a target such as the Xilinx Spartan-IIE (with no > >> > > dedicated multipliers). > >> > > > >> > > Another question: If we only need one result per K clock > >> > > periods (K ~= 1000 for audio applications), could a > >> > > multiplier block approach realized with, say, bit-serial > >> > > addition be more efficient than some other approach such > >> > > as distributed arithmetic? > >> > > > >> > > Comments welcome. Thanks. > >> > > > >> > > -Michael > >> > > ______________________ > >> > > Michael E. Spencer, Ph.D. > >> > > President > >> > > Signal Processing Solutions, Inc. > >> > > Web: http://www.spsolutions.com > >> > > >> > -- > >> > --Ray Andraka, P.E.
> >> > President, the Andraka Consulting Group, Inc. > >> > 401/884-7930 Fax 401/884-7950 > >> > email ray@andraka.com > >> > http://www.andraka.com > >> > > >> > "They that give up essential liberty to obtain a little > >> > temporary safety deserve neither liberty nor safety." > >> > -Benjamin > >> > Franklin, 1759 > >> > > >> > > > > -- > > --Ray Andraka, P.E. > > President, the Andraka Consulting Group, Inc. > > 401/884-7930 Fax 401/884-7950 > > email ray@andraka.com > > http://www.andraka.com > > > "They that give up essential liberty to obtain a little > > temporary safety deserve neither liberty nor safety." > > -Benjamin Franklin, 1759Article: 60392
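The Canonical-Signed-Digit recoding discussed in this thread is compact enough to demonstrate directly: each coefficient is rewritten with digits in {-1, 0, +1} so that no two nonzero digits are adjacent, and a constant multiply then costs one adder/subtractor per nonzero digit beyond the first. A small Python sketch of the standard recoding (generic textbook CSD; not code from the RSG tool or either vendor's compiler):

```python
def csd(n):
    """Return the CSD digit list (LSB first, digits in {-1, 0, 1}) for a
    non-negative integer n, using the standard recoding: whenever the
    two low bits are '11', emit -1 and carry, turning a run of ones
    into a single +1 ... -1 pair."""
    digits = []
    while n:
        if n & 1:
            d = 2 - (n & 3)      # +1 if low bits are '01', -1 if '11'
            n -= d
        else:
            d = 0
        digits.append(d)
        n >>= 1
    return digits

def adders_needed(coeff):
    """A constant multiply by `coeff` costs (nonzero CSD digits - 1)
    add/subtract operations in a shift-add network."""
    return max(0, sum(1 for d in csd(coeff) if d) - 1)

# 15 = 16 - 1 in CSD: one subtract instead of three adds for 8+4+2+1
print(csd(15))             # [-1, 0, 0, 0, 1]
print(adders_needed(15))   # 1
print(csd(11))             # [-1, 0, -1, 0, 1]: 16 - 4 - 1
print(adders_needed(11))   # 2
```

The multiplier-block techniques in the thread go further by sharing sub-expressions *between* coefficients; per-coefficient CSD as above is just the starting point they improve on.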
Bob Perlman wrote:
>> There are some recent developments in EDA tools like Mentor
>> Graphics' VStation which cater to problems like I am facing by
>> "actually" simulating on the target hardware. But they are toooooooooo
>> costly (I don't know what makes EDA tool companies to fix such a high
>> price for their products)
>> Is there any alternate way of simulating my design ?
>
> What's your motivation for doing post-route simulation? Why not just
> simulate your pre-synthesis code, and use post-route static timing
> analysis to confirm the timing?

We've started doing post-route simulation of a design we are working on, and we are finding that xst is being "surprising". It hasn't been matching pre-synthesis simulation. We've been finding bugs in the synthesis results. Yikes!

--
Steve Williams                "The woods are lovely, dark and deep.
steve at icarus.com            But I have promises to keep,
http://www.icarus.com          and lines to code before I sleep,
http://www.picturel.com        And lines to code before I sleep."

Article: 60393
Peter Monta wrote:
> It seems to be possible to run the Xilinx 6.1i tools on Red Hat 9;
> I thought I'd post the details to save people Googling for the
> answer.

OK, how 'bout SuSE 8? Or better yet, SLES8 for AMD64?

--
Steve Williams                "The woods are lovely, dark and deep.
steve at icarus.com            But I have promises to keep,
http://www.icarus.com          and lines to code before I sleep,
http://www.picturel.com        And lines to code before I sleep."

Article: 60394
"ykagarwal" <yog_aga@yahoo.co.in> wrote in message news:4d05e2c6.0309110200.71793e02@posting.google.com...
(snip)
> fine, thanks i cud find the book (bit old edition probably)
> here but there is no detail abt pipelined divider as such ..
> anyway if somebody comes across the thing may suggest.
> and xilinx probably shud give a sequential version at least for
> larger width
> (i've made it anyway)

The references for the 360/91 are to the IBM Journal of Research and Development, I believe Vol. 11, January 1967.

-- glen

Article: 60395
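For what it's worth, the sequential divider asked about above boils down to the classic restoring shift-and-subtract loop, one quotient bit per clock. A small behavioral sketch (mine, unsigned operands only, not any Xilinx core):

```python
def restoring_divide(dividend, divisor, width):
    """Restoring division, one quotient bit per iteration, modeling
    what a sequential hardware divider does per clock cycle."""
    assert 0 <= dividend < (1 << width) and divisor > 0
    rem = 0
    quo = 0
    for i in range(width - 1, -1, -1):
        # Shift the next dividend bit into the partial remainder.
        rem = (rem << 1) | ((dividend >> i) & 1)
        quo <<= 1
        # Trial subtraction: if it doesn't underflow, keep it and
        # set the quotient bit; otherwise "restore" (do nothing).
        if rem >= divisor:
            rem -= divisor
            quo |= 1
    return quo, rem

print(restoring_divide(100, 7, 8))   # -> (14, 2)
```

In hardware this is one subtractor, two shift registers, and a small counter, so it is far cheaper than an unrolled pipelined array divider when one result every `width` cycles is acceptable.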
"rickman" <spamgoeshere4@yahoo.com> wrote in message news:3F5FD89D.D7B92959@yahoo.com...
> Glen Herrmannsfeldt wrote:
>
> > (snip about a crystal oscillator for use with FPGA's)
> > The oscillator that I used to know used three CMOS inverting gates in series
> > with the crystal wrapped around them. Possibly some resistors, too.
> > Usually one more gate to buffer and shape the result.
> >
> > Though I remember people having a hard time sometimes with the 32kHz
> > crystals, it worked well for everything else.
>
> But what is the advantage over an oscillator unless you are trying to
> squeeze every penny out of the design? The difference between an
> oscillator and a crystal is less than $.50.

Some people just don't like them, but otherwise I agree.

-- glen

Article: 60396
"Pete Fraser" <pete@rgb.com> wrote in message news:<vm1nindpbsac54@news.supernews.com>...
> You mentioned in a recent thread that the 3S1000 would sell
> for $20 in CY2004 in the slowest speed grade and large
> quantities.
>
> I was recently quoted $85.65 for XC3S1000-4FG676C
> in 5000s for CY2004. Is there really such a huge difference
> between 5000 piece prices and "large quantities"?

Howdy Pete,

Believe me, you are not anywhere near "large quantities." Xilinx uses that phrase when they put useless prices in their press releases - it most often refers to 250k pieces a year, so you are only off by 50x. See http://www.xilinx.com/prs_rls/silicon_spart/0333spartan3.htm or almost any other of their recent press releases for examples.

Have fun,
Marc

Article: 60397
Peter Alfke wrote:
>
> Interesting.
> Let's say we have two frequencies, 100 MHz even, and 100.000 050 MHz,
> which is 50 Hz higher. These two frequencies will beat or wander over
> each other 50 times per second.
> Assuming no noise and no jitter, each step will be 10 ns divided by 2
> million = 5 femtoseconds. That is 80 times wider than the capture window
> for a 1.5 ns delay. Therefore we can treat this case the same way as my
> original case with totally asynchronous frequencies. I think even jitter
> has no bearing on this, because it also would be far, far wider than the
> capture window. That means, this slowly drifting case is not special at
> all, except that metastable events would be spaced by multiples of 20 ms
> (1/50 Hz) apart. But that's irrelevant for events that occur on average
> once per year or millennium.
>
> Now, you will never ever, under any circumstances, get a guarantee not
> to exceed a long delay, since by accident the flip-flop might go
> perfectly metastable and stay for a long time. It's just an extremely
> small probability, expressed as a very, very long MTBF. That is the
> fundamental nature of metastability.
>
> To repeat, I like the capture window approach because it is independent
> of data rate and clock rate.
> Greetings, and thanks for the discussion. It helped me clear up my mind...

I don't want to beat a dead horse, but I do want to make clear that the capture window model does not eliminate the frequency of the clock and data from the failure rate calculation. The basic probability of a failure from any single event is clearly explained by the window model, but to get a failure rate you need to know the clock rates to know how often the possible event is tested, so to speak. If you double either the clock or the data rate, you double the failure rate.

--
Rick "rickman" Collins
rick.collins@XYarius.com
Ignore the reply address. To email me use the above address with the XY removed.
Arius - A Signal Processing Solutions Company
Specializing in DSP and FPGA design
URL http://www.arius.com
4 King Ave                     301-682-7772 Voice
Frederick, MD 21701-3110       301-682-7666 FAX

Article: 60398
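Rickman's point above is easy to see if you turn Peter's two window parameters into numbers. A back-of-the-envelope sketch (the 0.07 fs at 1.5 ns and million-times-smaller-per-0.5-ns figures are the ones quoted in this thread; treat them as illustrative, not as a datasheet spec):

```python
def capture_window(delay_ns, w0_fs=0.07, t0_ns=1.5, shrink=1e6, step_ns=0.5):
    """Metastability capture-window width in seconds for a given
    resolution delay, per the two-parameter model discussed above:
    width w0_fs at delay t0_ns, shrinking by 'shrink' per step_ns."""
    return w0_fs * 1e-15 * shrink ** (-(delay_ns - t0_ns) / step_ns)

def mtbf(delay_ns, f_clock_hz, f_data_hz):
    """Mean time between failures: events land in the window at an
    average rate of f_clock * f_data * window, so both rates matter."""
    return 1.0 / (f_clock_hz * f_data_hz * capture_window(delay_ns))

# 100 MHz clock, 10 MHz async data, 1.5 ns of settling budget:
print(mtbf(1.5, 100e6, 10e6))   # about 14 seconds -- not enough margin
# Allow 2.5 ns instead, and the window shrinks by a factor of 1e12:
print(mtbf(2.5, 100e6, 10e6))   # about 450,000 years
```

The formula makes rickman's observation explicit: doubling either f_clock or f_data exactly halves the MTBF, while each extra half-nanosecond of settling time multiplies it by a million.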
Stephen Williams wrote:
>
> Bob Perlman wrote:
>
> >> There are some recent developments in EDA tools like Mentor
> >> Graphics' VStation which cater to problems like I am facing by
> >> "actually" simulating on the target hardware. But they are toooooooooo
> >> costly (I don't know what makes EDA tool companies to fix such a high
> >> price for their products)
> >> Is there any alternate way of simulating my design ?
> >
> > What's your motivation for doing post-route simulation? Why not just
> > simulate your pre-synthesis code, and use post-route static timing
> > analysis to confirm the timing?
>
> We've started doing post route simulation of a design we are
> working on, and we are finding that xst is being "surprising".
> It's been not matching pre-synthesis simulation. We've been
> finding bugs in the synthesis results. Yikes!

Can you provide any specifics? What sorts of errors are you finding? Is the synthesis not right? Are circuits being "optimized" incorrectly by the P&R software? Timing issues?

--
Rick "rickman" Collins
rick.collins@XYarius.com
Ignore the reply address. To email me use the above address with the XY removed.

Arius - A Signal Processing Solutions Company
Specializing in DSP and FPGA design
URL http://www.arius.com
4 King Ave                     301-682-7772 Voice
Frederick, MD 21701-3110       301-682-7666 FAX

Article: 60399
Glen Herrmannsfeldt wrote:
>
> "rickman" <spamgoeshere4@yahoo.com> wrote in message
> news:3F5FD89D.D7B92959@yahoo.com...
> > Glen Herrmannsfeldt wrote:
> >
> > > (snip about a crystal oscillator for use with FPGA's)
> > > The oscillator that I used to know used three CMOS inverting gates in series
> > > with the crystal wrapped around them. Possibly some resistors, too.
> > > Usually one more gate to buffer and shape the result.
> > >
> > > Though I remember people having a hard time sometimes with the 32kHz
> > > crystals, it worked well for everything else.
> >
> > But what is the advantage over an oscillator unless you are trying to
> > squeeze every penny out of the design? The difference between an
> > oscillator and a crystal is less than $.50.
>
> Some people just don't like them, but otherwise I agree.

I wouldn't know what there is not to like. An oscillator unit is smaller, simpler and works better than a crystal circuit you can design in just a few hours without extensive testing. If the cost difference is not an issue (such as production volumes below 10,000) I can't see how it would pay to design your own oscillator. Even with higher volume production, I bet the lower failure rate would make a self-design not worth the effort.

--
Rick "rickman" Collins
rick.collins@XYarius.com
Ignore the reply address. To email me use the above address with the XY removed.

Arius - A Signal Processing Solutions Company
Specializing in DSP and FPGA design
URL http://www.arius.com
4 King Ave                     301-682-7772 Voice
Frederick, MD 21701-3110       301-682-7666 FAX