Messages from 59975

Article: 59975
Subject: Re: FPGA/DSP Expert - business partner for innovative FFT
From: kim.seung@sbcglobal.net (Seung)
Date: 2 Sep 2003 20:42:56 -0700

Hi

I am sorry that I am not an expert on FPGAs, so benchmark numbers that
would show actual computation time with actual FPGA clock speed,
layout efficiencies, etc. are not available.

My innovation is more at the algorithmic level, so I would rather
compare the number of clock cycles and gate counts in an abstract sense.
I know that highly optimized radix-4, -8, -16 (even -32) modules are out
there, from which one could build fairly fast 1-D FFTs of moderate sizes,
using a mux-parallel scheme if necessary. I would say these are
implementation-optimized schemes.

My approach has the best speed/HW complexity ratio in its basic form.
The improvement mainly comes from simplicity of control and 
highly regular structure due to algorithmic advances.

For a 256-point FFT, it requires 256 clock cycles without parallel
implementation.
Precision/accuracy is not on my mind yet: I'm assuming that it is a
common trade-off. The delta improvement might be small for small
FFTs, but as the FFT size increases, it becomes more significant.
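
To put the N-cycle figure in context, here is a rough sketch that compares
it with the textbook radix-2 butterfly count, (N/2)*log2(N). It assumes a
single butterfly unit doing one butterfly per clock, which is only an
illustrative baseline and not a description of any particular core; the
function names are made up for this example.

```python
import math

def radix2_butterflies(n: int) -> int:
    """Butterfly count of a textbook radix-2 FFT (n must be a power of two)."""
    stages = int(math.log2(n))
    if 2 ** stages != n:
        raise ValueError("n must be a power of two")
    return (n // 2) * stages

def claimed_cycles(n: int) -> int:
    """Cycle count stated in the post: N clocks for an N-point FFT."""
    return n

for n in (16, 256, 4096):
    # Assumed baseline: one butterfly per clock on a single butterfly unit.
    print(n, radix2_butterflies(n), claimed_cycles(n))
```

The gap between the two columns grows with log2(N), which is consistent
with the remark that the improvement matters more for large transforms.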

Also, 2-D, 3-D, ... M-D FFTs can be done with the same
efficiency, i.e., an NxN 2-D FFT requires N^2 cycles without the
complexity of transpose HW, its control, or the associated latencies.
This is a true innovation for multi-dimensional signal processing.
Remember that there has been no true multi-dimensional FFT algorithm.
My algorithm is the first 'intrinsic' M-D FFT not relying on 1-D FFTs.
This innovation will make multi-dimensional signal processing much more
affordable.  For example, high-resolution hand-held 2-D/3-D ultrasound
imaging devices will be possible. Machine/computer vision applications
will also benefit greatly.

Sorry that I didn't provide direct answers to your questions, but I
hope my answer will provide a better understanding.

Regards,

Seung P. Kim

Ray Andraka <ray@andraka.com> wrote in message news:<3F550D7F.3DC1122B@andraka.com>...
> I'd be curious to hear about a benchmark or two:
> what is the transform time for a given size FFT, say 256 points?
> what is the FPGA utilization for that FFT?
> Precision/accuracy?
> I have an FFT kernel specifically designed for FPGAs that also has nothing
> idle during the FFT processing.  It also does not use the usual C-T
> butterfly, and has only local data flow.  The data sheet for the 16 point
> kernel is posted on my website.
> 
> Seung wrote:
> 
> > Hello
> >
> > I have a patent and recently added one more on innovative FFT
> > algorithm and architecture.
> > If you're a business minded expert on FPGA with interests in DSP, this
> > is a great opportunity. Our FFT is 'the' optimal HW solution as
> > follows:
> >
> > 1. Minimum HW complexity: 100% HW utilization
> > 2. Suitable for super fast pipelined FFT: only local data flow - not
> > based on butterfly algorithm
> > 3. Minimum clock cycles: baseline architecture needs N clock for
> > N-point FFT
> > 4. Scalable to arbitrary large FFT size
> > 5. Multi-dimension extension: world's first 'intrinsic'
> > multi-dimensional FFT algorithm & architecture (not relying on 1-D FFTs)
> > : great for 2-D/3-D real-time medical imaging, SAR, etc.
> >
> > If you're interested in building a business together based on this
> > innovation,
> > please contact me with your resume. It'll be ideal if you have
> > contacts for potential customers.
> >
> > Any help on this matter from FPGA/DSP group members will be
> > appreciated.
> >
> > Thanks.
> >
> > Seung P. Kim, Ph.D
> > Silicon Computing, Inc.
> > Mountain View, CA
> 
> --
> --Ray Andraka, P.E.
> President, the Andraka Consulting Group, Inc.
> 401/884-7930     Fax 401/884-7950
> email ray@andraka.com
> http://www.andraka.com
> 
>  "They that give up essential liberty to obtain a little
>   temporary safety deserve neither liberty nor safety."
>                                           -Benjamin Franklin, 1759

Article: 59976
Subject: Re: Thinking out loud about metastability
From: Ray Andraka <ray@andraka.com>
Date: Wed, 03 Sep 2003 00:14:06 -0400

Noise is not needed, but it may help to accelerate departure from the
balance point, shortening the metastable period.  On the other hand, noise
can also work the other way, pushing the Q point closer to the balance point
and thereby delaying recovery.  As a result, noise has no net effect on the
metastable probabilities.

rickman wrote:

> Bob Perlman wrote:
> >
> > Whoever said, "If you require noise to shift you out of metastability,
> > then the people who argue that more noise will get you out quicker
> > could then be right," could you explain further?  Are you saying that
> > noise is required to resolve the metastable state, or is this a
> > counter-argument to the "noise may get you out faster" claim?  Or is
> > it something else entirely?
>
> I guess this was not well stated.  This was in response to someone else
> who seemed to be saying that noise is needed to get out of
> metastability.  Just before this I believe I spoke about the perfect
> balance point being so small that it was not significant.  So noise is
> not really needed.  The quote above was to say that if noise really is
> required, then the advocates of more noise may be right.  But my point
> is that noise is not needed since there is virtually never a "perfect"
> balance.
>
> --
>
> Rick "rickman" Collins
>
> rick.collins@XYarius.com
> Ignore the reply address. To email me use the above address with the XY
> removed.
>
> Arius - A Signal Processing Solutions Company
> Specializing in DSP and FPGA design      URL http://www.arius.com
> 4 King Ave                               301-682-7772 Voice
> Frederick, MD 21701-3110                 301-682-7666 FAX

--
--Ray Andraka, P.E.
President, the Andraka Consulting Group, Inc.
401/884-7930     Fax 401/884-7950
email ray@andraka.com
http://www.andraka.com

 "They that give up essential liberty to obtain a little
  temporary safety deserve neither liberty nor safety."
                                          -Benjamin Franklin, 1759

Article: 59977
Subject: Generating Asynchronous FIFO in Block Memory of Spartan-II in CoreGen
From: atif@kics.edu.pk (Atif)
Date: 2 Sep 2003 21:30:20 -0700

Hello,
Please answer my following questions.

Q1. On the Xilinx site I've read that CoreGen is not available with ISE
WebPack, but the trial CD of ISE WebPack that I requested contains
CoreGen.
Please explain this discrepancy. Does the CoreGen that comes with
ISE Foundation have additional features and functionality, with the
WebPack one being a limited version?

Q2. I am generating a 32x1 asynchronous FIFO in the block memory of a
Spartan-II using CoreGen in my ISE WebPack 5 (trial version).
But in the data port parameters I am unable to set a FIFO depth of
1. The minimum depth available for this FIFO is 15, irrespective
of the FIFO width. How can I generate a 32x1 asynchronous
FIFO in CoreGen 5? Can I do this directly? If not, can I do it
by generating a 32x15 FIFO and using only the first location (depth 1),
leaving depths 2-15 unused?

Thanks and Regards
Atif

Article: 59978
Subject: Re: Thinking out loud about metastability
From: "Glen Herrmannsfeldt" <gah@ugcs.caltech.edu>
Date: Wed, 03 Sep 2003 04:58:21 GMT


"rickman" <spamgoeshere4@yahoo.com> wrote in message
news:3F550C7E.6A451965@yahoo.com...

(snip)

> I don't see how this is possible.  You are assuming that the CPU has
> some way to measure that a FF output is metastable.  I don't know of any
> way of doing that.  How is this circuit designed?

There are books, and probably web sites explaining self timed logic, or
asynchronous logic.

There are no clocks, but it takes more wires for each signal.   Designing is
completely different than synchronous logic designs, which means that the
current design tools don't help very much.   I don't know it well enough to
explain here, though.

-- glen

Article: 59979
Subject: Re: [ann] Microblaze uClinux Demo released
From: Dan Kegel <dank-news@kegel.com>
Date: Wed, 03 Sep 2003 06:14:18 GMT

John Williams wrote:
> We have ported the uClinux operating system to the Microblaze soft 
> processor core, developed by Xilinx for their FPGA family.  The demo 
> provides an easy-to-use package that demonstrates the progress and 
> potential of uClinux running on Microblaze.  The uClinux kernel is 
> released under the GNU GPL.

That's pretty cool.  I see on your page
http://www.itee.uq.edu.au/~jwilliams/mblaze-uclinux/
a link to gcc/newlib build instructions, and I gather
the source for your modified gcc is at http://www.xilinx.com/guest_resources/gnu/
However, I see no mention of Microblaze
anywhere in the gcc mailing list archives.
Is Xilinx planning on contributing their Microblaze port
back to the GCC project, i.e. have they executed a copyright
assignment so upcoming versions of gcc can support Microblaze out-of-the-box?

I ask partly because I'm wondering if it's worth folding Microblaze
support into my generic toolchain build script
(http://kegel.com/crosstool) or PTXDist.  This would be somewhat difficult
if Microblaze used a special hacked version of gcc...
- Dan

Article: 59980
Subject: Re: Measuring metastability.
From: rickman <spamgoeshere4@yahoo.com>
Date: Wed, 03 Sep 2003 03:30:06 -0400

Symon wrote:
> 
> Hi Austin,
>        Maybe I got the wrong end of the stick, but when Peter said:-
> "I have never seen strange levels or oscillations ( well, 25 years ago
> we had TTL oscillations). Metastability just affects the delay on the
> Q output." I thought he meant that he'd only seen metastability where
> the output from the FF was always either on or off, just that
> sometimes the transition was delayed. Philip's pictures clearly show
> 'strange levels'. This is important, I believe, when deciding what the
> effects of metastable FFs are on following circuitry. I guess we'll
> have to wait until he returns from his Portuguese jaunt before we find
> out what he meant!!

I think the exact behavior is largely irrelevant since a simple delay is
just as disastrous as anything else you would encounter.  Since you
don't know *when* the transition would happen, it could happen at the
moment the next FF is latching the intermediate value.  That is enough
for the next FF and all following logic to behave badly as well.  
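
The usual way to put a number on "you don't know when" is the standard
synchronizer MTBF estimate, MTBF = exp(t_r/tau) / (T_w * f_clk * f_data).
A minimal sketch follows; the parameter values in the example call are
placeholders chosen for illustration, not data for any real device.

```python
import math

def synchronizer_mtbf(t_resolve, tau, t_window, f_clk, f_data):
    """Standard MTBF estimate for a flip-flop sampling an asynchronous input.

    t_resolve : time allowed for the output to settle before it is used (s)
    tau       : resolution time constant of the flip-flop (s)
    t_window  : width of the metastability capture window (s)
    f_clk     : sampling clock frequency (Hz)
    f_data    : average rate of asynchronous input transitions (Hz)
    """
    return math.exp(t_resolve / tau) / (t_window * f_clk * f_data)

# Placeholder numbers (assumptions, not measured values).
for t_r in (1e-9, 2e-9, 3e-9):
    print(t_r, synchronizer_mtbf(t_r, 100e-12, 100e-12, 100e6, 1e6))
```

Each extra tau of settling time multiplies the MTBF by e, which is why
allowing more time before the next FF samples is the usual remedy.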

-- 

Rick "rickman" Collins

rick.collins@XYarius.com
Ignore the reply address. To email me use the above address with the XY
removed.

Arius - A Signal Processing Solutions Company
Specializing in DSP and FPGA design      URL http://www.arius.com
4 King Ave                               301-682-7772 Voice
Frederick, MD 21701-3110                 301-682-7666 FAX

Article: 59981
Subject: Re: Thinking out loud about metastability
From: rickman <spamgoeshere4@yahoo.com>
Date: Wed, 03 Sep 2003 03:37:35 -0400

Glen Herrmannsfeldt wrote:
> 
> "rickman" <spamgoeshere4@yahoo.com> wrote in message
> news:3F550C7E.6A451965@yahoo.com...
> 
> (snip)
> 
> > I don't see how this is possible.  You are assuming that the CPU has
> > some way to measure that a FF output is metastable.  I don't know of any
> > way of doing that.  How is this circuit designed?
> 
> There are books, and probably web sites explaining self timed logic, or
> asynchronous logic.
> 
> There are no clocks, but it takes more wires for each signal.   Designing is
> completely different than synchronous logic designs, which means that the
> current design tools don't help very much.   I don't know it well enough to
> explain here, though.

Yes, everything I have read (which is not a lot) simply relies on the
control path (the clock) having a longer delay than the data path.  So
the data will always reach the next stage before the control saying the
data is ready.  

There is no way to determine when a circuit is metastable or not.  Async
circuits are not magic, they just depend on predictable delays, just
like any other circuit.  The design is different because there is no
common clock so each circuit can run with its own delay.  Since the next
circuit will take the output when it is ready, there is no problem with
synchronization.  

One problem I do see is that if each stage has a different delay, then
it can not accept a new input from the preceding stage until the output
has been taken by the following stage.  It seems in the end the async
circuit will run no faster than the slowest stage, which is what the
sync clocked circuit will do.  
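
A quick way to check the "no faster than the slowest stage" intuition is a
toy model of a handshaking pipeline with no extra buffering: a stage may
start on item k only when the previous stage has finished item k and the
following stage has taken item k-1. This is a simplified sketch under
those assumptions, not a model of any real async design style.

```python
def completion_times(stage_delays, n_items):
    """Return the time at which each item leaves the last stage."""
    n = len(stage_delays)
    start = [[0.0] * n_items for _ in range(n)]
    done = [[0.0] * n_items for _ in range(n)]
    for k in range(n_items):
        for i in range(n):
            # Input is ready when the previous stage has finished item k.
            ready = done[i - 1][k] if i > 0 else 0.0
            if k == 0:
                free = 0.0
            elif i + 1 < n:
                # Stage i is free once stage i+1 has taken item k-1.
                free = start[i + 1][k - 1]
            else:
                # The last stage's output is assumed to be taken at once.
                free = done[i][k - 1]
            start[i][k] = max(ready, free)
            done[i][k] = start[i][k] + stage_delays[i]
    return done[-1]

times = completion_times([1, 3, 2], 20)
print(times[-1] - times[-2])  # steady-state spacing between outputs
```

In this model the steady-state spacing between outputs equals the largest
stage delay, the same limit a common clock would impose.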

But I have not read about it much, so perhaps there is more to it than
just this.  But without design tools or any expectation of using it
soon, there is not much incentive to spend much time reading up on it
now.  

-- 

Rick "rickman" Collins

rick.collins@XYarius.com
Ignore the reply address. To email me use the above address with the XY
removed.

Arius - A Signal Processing Solutions Company
Specializing in DSP and FPGA design      URL http://www.arius.com
4 King Ave                               301-682-7772 Voice
Frederick, MD 21701-3110                 301-682-7666 FAX

Article: 59982
Subject: OT: Block diagramming tools?
From: Jay <se10110@yahoo.com>
Date: Wed, 3 Sep 2003 02:40:33 -0500

Ok,

This is slightly off-topic but what do people in here use to create 
their digital logic block diagrams?

I am hopefully looking for something that runs on Win2000, but I'd 
consider Linux/FreeBSD alternatives.

Currently I am using "SmartDraw" (www.smartdraw.com) but I find my block 
diagrams look messy and it's not easy to "rubber band" things like clock 
signals. I sometimes like to place text in a block (i.e. "Counter") but 
also like to leave room for connections like a clock triangle (>) and 
text for EN(able) but when I'm finished the results aren't appealing.

I've also used Edge Diagrammer (from Pacesoft) and found it was "weird" 
in many respects.

Both Edge and SmartDraw are annoying in one particular aspect; if a line 
is "attached" to an object, when the object is moved, instead of using 
right-angles to re-draw the lines, the connecting line is put at an 
angle.

I know a lot of people in general use Visio. What is considered the 
best, leanest version of Visio? I loathe Office and especially 
anything beyond Office '97.

I'm also interested to hear if anyone uses Kivo/Kivo MP.

Thanks!
Jay.

Article: 59983
Subject: Re: Altera Devices
From: Martin Thompson <martin.j.thompson@trw.com>
Date: 03 Sep 2003 08:53:47 +0100

"Maciek" <mkazula@elka.pw.edu.pl> writes:

> > Hi Maciek,
> >
> > What device are you targeting?
> >
> > The older (10K/1K) series don't have dedicated clock enable signals,
> > or synchronous presets/clears, so they take up one of your inputs
> > each.
> >
> > This changed with later devices, eg Apex has dedicated CE inputs for
> > the LE, and LAB wide synch clear and load signals.  The data that is
> > loaded uses the d3 input still.
> >
> > Looking at the Cyclone datasheet, it seems to be similar to Apex in
> > that regard.
> >
> > Does that help?
> >
> > Cheers,
> > Martin
> Hi Martin!
>     Thank You for Your suggestion about checking the targeting device. I
> still don't know how to use AHDL to take advantage of dedicated control
> inputs in Apex devices, but I think I will solve this problem soon.

I think (although I'm not an AHDL man myself) that your code ought to
do the right thing, *if* the device you're targeting has the
capability... I've bashed in what I think your HDL is doing as a
schematic (Quartus II 2.2). If you'd like a copy of the file, let me
know by email and I'll send it to you.

Targeting a Cyclone device, it produces one LUT and a FF:

From the EQN file...
inst_lut_out = c_prev # a & (b0 # b1);
inst = DFFEA(inst_lut_out, GLOBAL(CLK), GLOBAL(RESETn), CKE, , );

Running your AHDL file produced 2 LUTs, as you said, which seems wrong
- I'd open a case with Altera about it...

Just for completeness, targeting a 1K (Acex) gives:

inst_lut_out = A1L8;
inst = DFFEA(inst_lut_out, GLOBAL(CLK), GLOBAL(RESETn), , CKE, , );
A1L8 = c_prev # a & (b1 # b0);

thereby using two LEs.
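
As a rough sanity check on why the enable costs an extra LE on the 1K, the
sketch below counts the inputs the logic equation really depends on and
adds one when the clock enable has to come in through the LUT. It assumes
a 4-input LUT per LE and is only a counting model, not what the fitter
actually does; the function names are made up for the example.

```python
from itertools import product

LUT_INPUTS = 4  # assumed LUT width of one logic element

def lut_function(c_prev, a, b0, b1):
    # AHDL: c_prev # a & (b0 # b1)   ('#' is OR, '&' is AND)
    return c_prev or (a and (b0 or b1))

def support_size(func, n_inputs):
    """Count the inputs that can actually change the function's output."""
    count = 0
    for i in range(n_inputs):
        for bits in product([False, True], repeat=n_inputs):
            flipped = list(bits)
            flipped[i] = not flipped[i]
            if bool(func(*bits)) != bool(func(*flipped)):
                count += 1
                break
    return count

def fits_one_lut(n_signals, dedicated_ce):
    """True if logic plus clock enable fits one LUT in this rough model."""
    needed = n_signals if dedicated_ce else n_signals + 1
    return needed <= LUT_INPUTS

n = support_size(lut_function, 4)
print(n, fits_one_lut(n, dedicated_ce=True), fits_one_lut(n, dedicated_ce=False))
```

With four real inputs, the function fills the LUT on its own, so a device
without a dedicated enable input has no room left for CKE in the same LE.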

It's not easy to see in the floorplanner what's going on, as both
devices claim to be using a DFFE with a clock enable pin, but you
can't see the way that the 3rd input is being used as that CKE signal
in the 1K architecture.  Why do we not have an FPGA Editor for Altera
for tracking these low-level things down...?

Oh well, hope that helps a little!

Cheers,
Martin

-- 
martin.j.thompson@trw.com
TRW Conekt, Solihull, UK
http://www.trw.com/conekt

Article: 59984
Subject: Re: altera latch synthesis
From: Martin Thompson <martin.j.thompson@trw.com>
Date: 03 Sep 2003 09:03:31 +0100

"Andrea" <NSP_g.cocchi@NSP_swapp-eng.it> writes:

> Hi all,
> 
> I'm working with Quartus II 2.2 sp2 and my target fpga is APEX20KE.
> I have described a latch in my VHDL code with an enable signal, something
> like this
> 

<snip code>

> Quartus II synthesizes this code mapping the latch on a lut with the
> following equation:
> 
> --A1L5 is q_local~1 at LC3_1_S1
> --operation mode is normal
> A1L5 = clk & (en & d # !en & A1L5) # !clk & A1L5;
> 
> The gate level backannotated simulation produces oscillation on q output.
> Actually this what I see in a more complex design (this is only an example).
> Looking the equations it seems to be all ok, even if SDF extracted doesn't
> contain any TIMING CHECKS for combinatorial part of a LE, nevertheless the
> equation describes a combinatorial loop.
> 

Do you really want a latch?  As you've seen, the architecture doesn't
support it directly, so it will get synthesised using LUTs.  The
static timing won't necessarily be able to handle it as it's an
asynchronous latch, so you'll be on your own timing-wise.
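
For reference, the quoted equation can be checked as a zero-delay Boolean
model. The sketch below just confirms that it behaves as a latch that is
transparent when clk and en are both high and holds otherwise. A model
like this cannot show glitches or oscillation, since those depend on the
real delays around the feedback path; the function name is invented here.

```python
from itertools import product

def latch_next(q, clk, en, d):
    # A1L5 = clk & (en & d # !en & A1L5) # !clk & A1L5
    return bool((clk and ((en and d) or ((not en) and q))) or ((not clk) and q))

for q, clk, en, d in product([False, True], repeat=4):
    expected = d if (clk and en) else q  # transparent only when clk & en
    assert latch_next(q, clk, en, d) == expected
print("equation matches a clk&en transparent latch in the zero-delay model")
```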


> Any suggestion?
> 

What was the question exactly?


Cheers,
Martin
-- 
martin.j.thompson@trw.com
TRW Conekt, Solihull, UK
http://www.trw.com/conekt

Article: 59985
Subject: How to extend a pulse width without clock!
From: peter.zhu@utstar.com (peterzhu)
Date: 3 Sep 2003 01:30:02 -0700

Due to a chip bug, I have to extend a (negative) pulse from 10ns
to 100ms in a CPLD (Altera 7128). The difficulty is that I have no
clock into the CPLD, so the CPLD is purely combinational logic. How can I
extend the pulse in this case?

Help me!

Article: 59986
Subject: Re: EDK problem!
From: Rienk van der Scheer <R.van.der.Scheer@no.3t.spam.nl>
Date: Wed, 03 Sep 2003 10:31:12 +0200

John T. wrote:
> Thanx for the reply. My question is more likely: Where, in what menu do you
> set an external input to be an interrupt source??? Where do you declare the
> name of the interrupt function??? With a timer you can do that by
> right-clicking the timer and chose a name for the "timer interrupt handler
> function".

In XPS you can open the system port properties dialog (rightclick on the 
system overview and select Object properties). In this dialog you can 
specify the class of an input signal I/O (empty), Clock or Interrupt.
When you select interrupt, you can add it to the Intr port of the 
interrupt controller.

If you want to debounce the switch first, you probably have to write 
your own peripheral. In that case you can indicate that the output port 
of this peripheral is of Interrupt type.

Regards,

Rienk

Article: 59987
Subject: Re: altera latch synthesis
From: "Andrea" <NSP_g.cocchi@NSP_swapp-eng.it>
Date: Wed, 3 Sep 2003 10:48:14 +0200

I don't want to modify a complex VHDL code; moreover, the architecture
pipeline needs latches to work properly without losing performance.
What do you suggest?

Andrea


"Martin Thompson" <martin.j.thompson@trw.com> wrote in message
news:ud6ei47u4.fsf@trw.com...
> "Andrea" <NSP_g.cocchi@NSP_swapp-eng.it> writes:
>
> > Hi all,
> >
> > I'm working with Quartus II 2.2 sp2 and my target fpga is APEX20KE.
> > I have described a latch in my VHDL code with an enable signal,
> > something like this
> >
>
> <snip code>
>
> > Quartus II synthesizes this code mapping the latch on a lut with the
> > following equation:
> >
> > --A1L5 is q_local~1 at LC3_1_S1
> > --operation mode is normal
> > A1L5 = clk & (en & d # !en & A1L5) # !clk & A1L5;
> >
> > The gate level backannotated simulation produces oscillation on q
> > output. Actually this what I see in a more complex design (this is only
> > an example). Looking the equations it seems to be all ok, even if SDF
> > extracted doesn't contain any TIMING CHECKS for combinatorial part of a
> > LE, nevertheless the equation describes a combinatorial loop.
> >
>
> Do you really want a latch?  As you've seen, the architecture doesn't
> support it directly, so it will get synthesised using LUTs.  The
> static timing won't necessarily be able to handle it as it's an
> asynchronous latch, so you'll be on your own timing-wise.
>
>
> > Any suggestion?
> >
>
> What was the question exactly?
>
>
> Cheers,
> Martin
> --
> martin.j.thompson@trw.com
> TRW Conekt, Solihull, UK
> http://www.trw.com/conekt

Article: 59988
Subject: Re: How to extend a pulse width without clock!
From: "Simon Peacock" <nowhere@to.be.found>
Date: Wed, 3 Sep 2003 20:50:36 +1200

Am not sure you can.. all the logic on the chip can't generate that kind of
delay..  but you might find another signal which you can use as a clock: A0
if you have a micro.. or ALE.. WR.. RD, something like that.. failing that..
an RC off chip :-)

Simon


"peterzhu" <peter.zhu@utstar.com> wrote in message
news:61c1427f.0309030030.57cc99c4@posting.google.com...
> Due to a chip bug, I have to extend a (negative) pulse from 10ns
> to 100ms in a CPLD (Altera 7128). The difficulty is that I have no
> clock into the CPLD, so the CPLD is purely combinational logic. How can I
> extend the pulse in this case?
>
> Help me!

Article: 59989
Subject: Re: Thinking out loud about metastability
From: oen_br@yahoo.com.br (Luiz Carlos)
Date: 3 Sep 2003 03:04:18 -0700

> Electron spin has all the same measurement issues that a FF has.  If the
> state of the electron spin is changing as the measurement is made, then
> what state is it in?  What will be the result of the measurement?  
> 

Rick, the electron spin is +1/2 or -1/2, there is no in between state,
it changes instantaneously (in one fundamental clock tick, ~10^-43
seconds).

Luiz Carlos

Article: 59990
Subject: Re: Compact FIR filters with multiplier blocks?
From: "Ken" <aeu96186_MENOWANTSPAM@yahoo.co.uk>
Date: Wed, 3 Sep 2003 11:07:11 +0100

Ray,

I sent this to Michael via email and he suggested the group would be
interested also...

My PhD (now drawing to the end) has been on implementing full-parallel
Transpose FIR filters using multiplier blocks that you mention (I use
techniques/algorithms that exceed the efficiency of CSD in terms of FPGA
area).
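
For readers who have not met CSD, here is a minimal sketch of the baseline
being compared against: recoding a positive integer coefficient into
canonical signed digits (+1, 0, -1 with no two adjacent nonzero digits) and
counting the adders a shift-and-add multiplier would need. This shows only
the plain CSD baseline, not the techniques from my thesis, and the function
names are invented for the example.

```python
def to_csd(n):
    """Return CSD digits of a positive integer, least significant first."""
    digits = []
    while n != 0:
        if n & 1:
            d = 2 - (n & 3)  # +1 if n mod 4 == 1, -1 if n mod 4 == 3
            n -= d
        else:
            d = 0
        digits.append(d)
        n >>= 1
    return digits

def adders_for_constant(n):
    """Adders/subtractors needed to multiply by n using its CSD digits."""
    nonzero = sum(1 for d in to_csd(n) if d != 0)
    return max(nonzero - 1, 0)

coeff = 119  # binary 1110111 has six ones
print(to_csd(coeff), adders_for_constant(coeff))
```

Multiplier-block methods go further by sharing intermediate sums between
different coefficients rather than treating each constant on its own.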

The upshot of my work is that I have written a C++ program that will
generate RTL VHDL given the quantised filter coefficients, the type of
filter required (singlerate, interpolation, decimation etc.) and the
appropriate parameters (input width, signed/unsigned input, number of
channels, rate-change factor etc.)

The VHDL my program generates exceeds the functionality (at a lower
cost) of that provided by Xilinx's Distributed Arithmetic core and Altera's
FIR Compiler (also DA).  In fact, my program allows interpolation and
decimation factors up to the number of filter coefficients and any number of
data channels (for interpolation/decimation filters also).

The main point is that, once synthesised and mapped to a specific FPGA, the
filters my program generates require far less FPGA area (slices/logic cells)
than those generated using Distributed Arithmetic.  The critical path in my
filters is just the longest adder carry chain so very high speeds are
possible.  E.g. 154MHz for a singlerate filter (25 bit output) in a Xilinx
xc2v3000-fg676-5 - obviously the speed will depend on the device
family/speed grade and the longest carry chain.  The facility for multiple
channels in interpolation/decimation filters (not supported by Xilinx)
allows lower than full-parallel sampling rates to be efficiently processed
in one filter.
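
To illustrate why the critical path is a single adder, here is a small
behavioural sketch of a transposed-form FIR. Every input sample is
multiplied by all coefficients at once (the job of the multiplier block in
hardware), and each product feeds one adder followed by a register. It is
a software model with invented names, not the VHDL my program generates.

```python
def fir_transposed(coeffs, samples):
    """Transposed-form FIR: one adder between consecutive registers."""
    n = len(coeffs)
    regs = [0] * n  # regs[n-1] stays 0 and terminates the chain
    out = []
    for x in samples:
        # In hardware these products come from one shared multiplier block.
        products = [c * x for c in coeffs]
        y = products[0] + regs[0]
        for i in range(n - 1):
            # regs[i + 1] still holds its value from the previous sample.
            regs[i] = products[i + 1] + regs[i + 1]
        out.append(y)
    return out

def fir_direct(coeffs, samples):
    """Reference: y[t] = sum over k of coeffs[k] * x[t - k]."""
    return [sum(c * samples[t - k] for k, c in enumerate(coeffs) if t - k >= 0)
            for t in range(len(samples))]

print(fir_transposed([3, -1, 2], [1, 0, 0, 4, 5]))
```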

As Michael points out in his post, this technique would be very suitable for
a Xilinx Spartan-IIE and indeed any FPGA - there are many cases where these
filters would be useful even on devices with dedicated multipliers (when
they are all in use for example!  ;-)   ).

You can find out more at http://www.dspec.org/rsg.asp - there are also
datasheets here that provide comparisons with Xilinx and Altera and
demonstrate the output of another application (written in Java) that
generates schematic representations of the filters for use in reports,
meetings and theses!  :-)

I hope this information is of use to you - please contact me if you have any
questions,

Thanks for your time,

Ken

-- 
To reply by email, please remove the _MENOWANTSPAM from my email address.

"Ray Andraka" <ray@andraka.com> wrote in message
news:3F54F936.5E694FD1@andraka.com...
> The problem with the multiplier block approach is that the
> construction is predicated on the specific coefficients.  As
> a result it is considerably harder to use for an arbitrary
> set of coefficients.  It may reduce area over a straight FIR
> filter running at the same clocks per sample, but at a
> considerable cost in design time and flexibility.  You also
> give up regularity in the structure, which may reduce the
> overall performance.   Essentially what the block multiplier
> and distributed arithmetic approaches are is a rearrangement
> of the bitwise product terms.  The multiplier block takes
> advantage of duplicate terms by adding the inputs before
> they are multiplied by the term.
>
> Michael Spencer wrote:
>
> > Hello,
> >
> > Has anyone compared FPGA implementations of full-rate
> > digital FIR filters based on the use of Multiplier Blocks
> > vs. traditional FIRs with constant coefficient
> > multipliers? By full rate, I mean: one output result per
> > clock cycle and no interpolation or decimation.
> >
> > For anyone not familiar, a multiplier block is a network
> > of shifters and adders that performs multiplications by
> > several coefficients efficiently by exploiting common
> > sub-expressions. The multiplier block can be exploited in
> > FIR filters by transposing the standard filter so that the
> > products of all the coefficients with the current
> > input-sample are required simultaneously.
> >
> > Also, by representing the coefficients in the
> > Canonical-Signed-Digit number system (a small number of
> > +1 and -1's) along with common sub-expression sharing the
> > multiplier block can get even smaller.
> >
> > For example, the multiplier block for a 100 tap FIR filter
> > (fp=0.10 and fs=0.12) can be realized with only 61 adds
> > (zero explicit multiplications). See filter example #4 in
> > "FIR Filter Synthesis Algorithms for Minimizing the Delay
> > and the Number of Adders,"
> > http://ics.kaist.ac.kr/~dk/papers/TCAD2001.pdf
> > If the adder depth is constrained to a maximum of four,
> > then the authors' algorithm can do the multiplier block in
> > 69 additions.
> >
> > It would seem that this approach would be very efficient
> > in a target such as the Xilinx Spartan-IIE (with no
> > dedicated multipliers).
> >
> > Another question: If we only need one result per K clock
> > periods (K ~= 1000 for audio applications), could a
> > multiplier block approach realized with, say, bit-serial
> > addition be more efficient than some other approach such
> > as distributed arithmetic?
> >
> > Comments welcome. Thanks.
> >
> > -Michael
> > ______________________
> > Michael E. Spencer, Ph.D.
> > President
> > Signal Processing Solutions, Inc.
> > Web: http://www.spsolutions.com
>
> --
> --Ray Andraka, P.E.
> President, the Andraka Consulting Group, Inc.
> 401/884-7930     Fax 401/884-7950
> email ray@andraka.com
> http://www.andraka.com
>
>  "They that give up essential liberty to obtain a little
>   temporary safety deserve neither liberty nor safety."
>                                           -Benjamin Franklin, 1759
>
>

Article: 59991
Subject: Newbie CAN Core question - Student
From: f.sethna@sussex.ac.uk (Fouad)
Date: 3 Sep 2003 03:18:43 -0700

Hi,
I'm new to FPGAs... I have a problem that requires me to use a VHDL CAN
(Controller Area Network) core... I got the core from an open source..
and I get the error below:

FATAL_ERROR:Xst:Portability/export/Port_Main.h:126:1.13 - This
application has discovered an exceptional condition from which it
cannot recover.  Process will terminate.  To resolve this error,
please consult the Answers Database and other online resources at
http://support.xilinx.com. If you need further assistance, please open
a Webcase by clicking on the "WebCase" link at
http://support.xilinx.com
Error: XST failed

the support page doesn't give me any help as they haven't covered this
error yet...
any ideas?

can anyone suggest another freely available CAN core?

Thanks for all your help in advance.

Fouad

Article: 59992
Subject: Re: EDK problem!
From: antti@case2000.com (Antti Lukats)
Date: 3 Sep 2003 03:31:14 -0700

> Thanx for the reply. My question is more likely: Where, in what menu do you
> set an external input to be an interrupt source??? Where do you declare the
> name of the interrupt function??? With a timer you can do that by
> right-clicking the timer and chose a name for the "timer interrupt handler
> function".

Oops, I didn't think 
"I usually don't (think), and my wife says its total disaster when I do"
;)

I am afraid that to get the int line out you need to define a real dummy
peripheral that simply routes the pin to the interrupt controller,
then add this peripheral component, place the ports and connect.
In the VHDL of the component you only have one wire connection.

I should think sometimes, sorry. You are right: if there is no component
driving the interrupt you cannot assign an int handler either.

antti

Article: 59993
Subject: Re: Input comparator
From: oen_br@yahoo.com.br (Luiz Carlos)
Date: 3 Sep 2003 04:09:23 -0700

Hi Austin.

First, as Andrey pointed out, I took the wrong table.
The best values are for GTL:
Vout = low,  if Vin <= Vref - 0.05
Vout = high, if Vin >= Vref + 0.05
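
Written out as a small sketch, the two inequalities leave a 100 mV band
around Vref where the output is not specified. The Vref value in the
example is a placeholder and the function name is made up.

```python
def gtl_input_state(vin, vref, margin=0.05):
    """Classify an input voltage against the two thresholds above.

    Returns "low", "high", or None when vin is inside the unspecified band.
    """
    if vin <= vref - margin:
        return "low"
    if vin >= vref + margin:
        return "high"
    return None  # inside Vref +/- margin: output not guaranteed

VREF = 0.8  # placeholder reference voltage for the example
for v in (0.70, 0.78, 0.82, 0.90):
    print(v, gtl_input_state(v, VREF))
```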

> There is some small offset voltage from the mis-match between the
> differential pairs (both nmos and cmos to cover the voltage range).  I
> do not know what this offset might be, but I suspect it is less than a
> few tens of millivolts, worst case from the transistor models.
> 
> The comparator will switch as soon as the voltage is greater than the
> offset (we spec 100 mV for speed reasons, not because it needs > 100 mv
> to function).
> 
> So with 50 mV it will switch, just more slowly than if it was 100 mV.

I really would like to know what this offset is and to have a speed
versus offset formula.

But, let's suppose we didn't reach the offset value. If we sweep Vin
from Vref-Voffset to Vref+Voffset, I think Vout will range from Vlow
to Vhigh monotonically, maybe almost linearly. Am I right?

Now, if we sample Vout with the input data flip-flop, FF DOUT will be
0 or 1 (forget about metastability for now). Can I say there is a Vthr
where: if Vout>Vthr then DOUT=1 and if Vout<Vthr then DOUT=0? What
happens when we feed a data flip-flop with an analog signal?

I also would like to know if when we define an input as LVTTL (for
example), the same input comparator is used (Vref connected to an
internal reference), or if it is bypassed.

Luiz Carlos

Article: 59994
Subject: Re: Input comparator
From: oen_br@yahoo.com.br (Luiz Carlos)
Date: 3 Sep 2003 04:17:26 -0700
Links: << >> << T >> << A >>

> Consider using... an analog comparator!

Yes John.
But, if I can have one for free, why not use it?
Even if it doesn't fit my needs, the knowledge remains.

Luiz Carlos

Article: 59995
Subject: Re: How to extend a pulse width without clock!
From: "Lorenzo" <lorenzol@despammed.com>
Date: Wed, 3 Sep 2003 14:24:25 +0200
Links: << >> << T >> << A >>

"peterzhu" <peter.zhu@utstar.com> ha scritto nel messaggio
news:61c1427f.0309030030.57cc99c4@posting.google.com...

> Due to a chip bug, I have to extend a pulse width (negative) from 10ns
> to 100ms in CPLD (Altera 7128).

Can you add some very small components? You can build a very simple
monostable by connecting a RC network between two CPLD pins.

-- 
Lorenzo
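
To get a feel for the component values such an RC network would need, here is a minimal sketch based on the standard first-order charging equation V(t) = Vcc * (1 - exp(-t / RC)). The 5 V supply, the input threshold at half the supply and the 1 uF capacitor are assumptions chosen for illustration only. A real CPLD input threshold is not specified that tightly, and slow edges on a non-Schmitt input may need extra care.

```python
import math


def rc_delay(r_ohms, c_farads, vcc, vth):
    """Time for a capacitor charging from 0 V toward vcc through r to reach vth."""
    return r_ohms * c_farads * math.log(vcc / (vcc - vth))


def resistor_for_delay(t_seconds, c_farads, vcc, vth):
    """Resistor value that gives the requested delay for a chosen capacitor."""
    return t_seconds / (c_farads * math.log(vcc / (vcc - vth)))


# Assumed values: 100 ms target, 1 uF capacitor, 5 V supply, 2.5 V threshold
r = resistor_for_delay(0.100, 1e-6, vcc=5.0, vth=2.5)
print(f"R = {r / 1e3:.1f} kOhm")
print(f"check: {rc_delay(r, 1e-6, 5.0, 2.5) * 1e3:.1f} ms")
```

With the threshold at half the supply the delay reduces to RC * ln(2), so the result scales directly with whatever capacitor is chosen.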

Article: 59996
Subject: Using a different editor for ISE 5
From: "Theron Hicks" <hicksthe@egr.msu.edu>
Date: Wed, 3 Sep 2003 09:08:06 -0400
Links: << >> << T >> << A >>

Hello,
    Is there a way that I can (within ISE 5) use a different editor?  I want
the editor to be integrated into the ISE system but I want to use a
different editor.  The feature that I most want is automatic formatting of
indents, etc.  ISE does a good job, but if I screw up when I put in a new
block of code I want to add auto-indent (like that available in Matlab).
Folks at Xilinx.... Take this as a hint.

Thanks,
Theron Hicks

Article: 59997
Subject: Re: Compact FIR filters with multiplier blocks?
From: Ray Andraka <ray@andraka.com>
Date: Wed, 03 Sep 2003 09:12:03 -0400
Links: << >> << T >> << A >>

I agree the multiplier block style filters are more efficient area-wise.  It
sounds like you have addressed the irregularity issues by using a program
to do the generation, which I think is pretty much a necessity.  As I thought
I alluded to, the biggest problem with multiplier block filters is that the
layout/size is not a constant if you change the coefficients.  This means that
the filter coefficients have to be constant and known earlier in the design
cycle, and necessitates a rerun of synthesis, place and route for any filter
changes.  Depending on the implementation, it may also mean a change in the
filter's pipeline latency.  These factors can make them difficult to use on
some projects.  The filters typically used in my projects often need to be
adjusted by the customer or late in the project to accommodate minor
requirements changes.  I prefer to use a filter with reloadable coefficients
for that reason.
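
To illustrate why the size is not constant, here is a small sketch that encodes integer coefficients in Canonical-Signed-Digit form and counts the adders needed when each coefficient is built on its own, with no sub-expression sharing. The function names and the two coefficient sets are made up for this example. A real multiplier block generator shares terms between coefficients, so its adder count would normally be lower, but it depends on the coefficient values in the same way.

```python
def to_csd(n):
    """Return the CSD digits (-1, 0, +1) of a non-negative integer, LSB first."""
    digits = []
    while n != 0:
        if n & 1:
            d = 2 - (n & 3)  # +1 when n % 4 == 1, -1 when n % 4 == 3
            n -= d
        else:
            d = 0
        digits.append(d)
        n >>= 1
    return digits


def shift_add_multiply(x, digits):
    """Multiply x by the constant encoded in digits.

    Each nonzero digit selects an add (+1) or subtract (-1) of a shifted x.
    """
    return sum(d * (x << k) for k, d in enumerate(digits))


def adder_count(coefficients):
    """Adders/subtractors needed when every coefficient is built separately."""
    total = 0
    for c in coefficients:
        nonzero = sum(1 for d in to_csd(c) if d != 0)
        total += max(nonzero - 1, 0)
    return total


print(to_csd(23))                # LSB-first digits of 23
print(adder_count([7, 23, 45]))  # one hypothetical coefficient set
print(adder_count([8, 16, 32]))  # pure powers of two need no adders
```

Changing even one coefficient changes the digit pattern and therefore the adder network, which is why a new set of coefficients means a new synthesis and place and route run.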



Ken wrote:

> Ray,
>
> I sent this to Michael via email and he suggested the group would be
> interested also...
>
> My PhD (now drawing to the end) has been on implementing full-parallel
> Transpose FIR filters using multiplier blocks that you mention (I use
> techniques/algorithms that exceed the efficiency of CSD in terms of FPGA
> area).
>
> The upshot of my work is that I have written a C++ program that will
> generate RTL VHDL given the quantised filter coefficients, the type of
> filter required (singlerate, interpolation, decimation etc.) and the
> appropriate parameters (input width, signed/unsigned input, number of
> channels, rate-change factor etc.)
>
> The VHDL my program generates exceeds the functionality (at a lower
> cost) of that provided by Xilinx's Distributed Arithmetic core and Altera's
> FIR Compiler (also DA).  In fact, my program allows interpolation and
> decimation factors up to the number of filter coefficients and any number of
> data channels (for interpolation/decimation filters also).
>
> The main point is that, once synthesised and mapped to a specific FPGA, the
> filters my program generates require far less FPGA area (slices/logic cells)
> than those generated using Distributed Arithmetic.  The critical path in my
> filters is just the longest adder carry chain so very high speeds are
> possible.  E.g. 154MHz for a singlerate filter (25 bit output) in a Xilinx
> xc2v3000-fg676-5 - obviously the speed will depend on the device
> family/speed grade and the longest carry chain.  The facility for multiple
> channels in interpolation/decimation filters (not supported by Xilinx)
> allows lower than full-parallel sampling rates to be efficiently processed
> in one filter.
>
> As Michael points out in his post, this technique would be very suitable for
> a Xilinx Spartan-IIE and indeed any FPGA - there are many cases where these
> filters would be useful even on devices with dedicated multipliers (when
> they are all in use for example!  ;-)   ).
>
> You can find out more at http://www.dspec.org/rsg.asp - there are also
> datasheets here that provide comparisons with Xilinx and Altera and
> demonstrate the output of another application (written in java) that
> generates schematic representations of the filters for use in reports,
> meetings and thesises!  :-)
>
> I hope this information is of use to you - please contact me if you have any
> questions,
>
> Thanks for your time,
>
> Ken
>
> --
> To reply by email, please remove the _MENOWANTSPAM from my email address.
>
> "Ray Andraka" <ray@andraka.com> wrote in message
> news:3F54F936.5E694FD1@andraka.com...
> > The problem with the multiplier block approach is that the
> > construction is predicated on the specific coefficients.  As
> > a result it is considerably harder to use for an arbitrary
> > set of coefficients.  It may reduce area over a straight FIR
> > filter running at the same clocks per sample, but at a
> > considerable cost in design time and flexibility.  You also
> > give up regularity in the structure, which may reduce the
> > overall performance.   Essentially what the block multiplier
> > and distributed arithmetic approaches are is a rearrangement
> > of the bitwise product terms.  The multiplier block takes
> > advantage of duplicate terms by adding the inputs before
> > they are multiplied by the term.
> >
> > Michael Spencer wrote:
> >
> > > Hello,
> > >
> > > Has anyone compared FPGA implementations of full-rate
> > > digital FIR filters based on the use of Multiplier Blocks
> > > vs. traditional FIRs with constant coefficient
> > > multipliers? By full rate, I mean: one output result per
> > > clock cycle and no interpolation or decimation.
> > >
> > > For anyone not familiar, a multiplier block is a network
> > > of shifters and adders that performs multiplications by
> > > several coefficients efficiently by exploiting common
> > > sub-expressions. The multiplier block can be exploited in
> > > FIR filters by transposing the standard filter so that the
> > > products of all the coefficients with the current
> > > input-sample are required simultaneously.
> > >
> > > Also, by representing the coefficients in the
> > > Canonical-Signed-Digit number system (a small number of
> > > +1 and -1's) along with common sub-expression sharing the
> > > multiplier block can get even smaller.
> > >
> > > For example, the multiplier block for a 100 tap FIR filter
> > > (fp=0.10 and fs=0.12) can be realized with only 61 adds
> > > (zero explicit multiplications). See filter example #4 in
> > > "FIR Filter Synthesis Algorithms for Minimizing the Delay
> > > and the Number of Adders,"
> > > http://ics.kaist.ac.kr/~dk/papers/TCAD2001.pdf
> > > If the adder depth is constrained to a maximum of four,
> > > then the authors' algorithm can do the multiplier block in
> > > 69 additions.
> > >
> > > It would seem that this approach would be very efficient
> > > in a target such as the Xilinx Spartan-IIE (with no
> > > dedicated multipliers).
> > >
> > > Another question: If we only need one result per K clock
> > > periods (K ~= 1000 for audio applications), could a
> > > multiplier block approach realized with, say, bit-serial
> > > addition be more efficient than some other approach such
> > > as distributed arithmetic?
> > >
> > > Comments welcome. Thanks.
> > >
> > > -Michael
> > > ______________________
> > > Michael E. Spencer, Ph.D.
> > > President
> > > Signal Processing Solutions, Inc.
> > > Web: http://www.spsolutions.com
> >
> > --
> > --Ray Andraka, P.E.
> > President, the Andraka Consulting Group, Inc.
> > 401/884-7930     Fax 401/884-7950
> > email ray@andraka.com
> > http://www.andraka.com
> >
> >  "They that give up essential liberty to obtain a little
> >   temporary safety deserve neither liberty nor safety."
> >                                           -Benjamin Franklin, 1759
> >
> >

--
--Ray Andraka, P.E.
President, the Andraka Consulting Group, Inc.
401/884-7930     Fax 401/884-7950
email ray@andraka.com
http://www.andraka.com

 "They that give up essential liberty to obtain a little
  temporary safety deserve neither liberty nor safety."
                                          -Benjamin Franklin, 1759
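
For readers following the thread, the transposed-form filter with a multiplier block described in the quoted posts can be sketched in software as follows. This is only a behavioural model under assumptions: the coefficient set `H` and the hand-built shift-and-add network in `multiplier_block` are hypothetical and are not taken from any of the posters' designs or from the cited paper.

```python
H = [3, 5, 15, 5, 3]  # hypothetical symmetric integer coefficients


def multiplier_block(x):
    """Produce every H[k] * x from one input sample using shifts and adds."""
    t3 = (x << 1) + x      # 3x, one adder
    t5 = (x << 2) + x      # 5x, one adder
    t15 = (t3 << 2) + t3   # 15x = 4 * 3x + 3x, reuses t3, one adder
    return [t3, t5, t15, t5, t3]


def fir_transposed(x):
    """Transposed-form FIR: all products of the current sample feed an adder chain."""
    regs = [0] * (len(H) - 1)
    out = []
    for sample in x:
        p = multiplier_block(sample)
        out.append(p[0] + regs[0])
        # Each register takes its product plus the register further down the chain
        regs = [p[k + 1] + (regs[k + 1] if k + 1 < len(regs) else 0)
                for k in range(len(regs))]
    return out


def fir_direct(h, x):
    """Reference: y[n] = sum over k of h[k] * x[n - k], with x[n] = 0 for n < 0."""
    return [sum(h[k] * x[n - k] for k in range(len(h)) if n - k >= 0)
            for n in range(len(x))]


samples = [1, -2, 3, 0, 4, -1, 0, 0, 0, 0]
print(fir_transposed(samples) == fir_direct(H, samples))
```

In this toy case five coefficient products cost three adders, because the filter is symmetric and 15x is built from the already available 3x. The adder chain after the block is the regular part of the structure; the block itself is what changes shape with the coefficients.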

Article: 59998
Subject: Re: Newbie CAN Core question - Student
From: antti@case2000.com (Antti Lukats)
Date: 3 Sep 2003 06:40:45 -0700
Links: << >> << T >> << A >>

f.sethna@sussex.ac.uk (Fouad) wrote in message news:<8c409408.0309030218.7d94baef@posting.google.com>...
> Hi,
> I'm new to FPGA's... I have a problem that requires me to use a VHDL CAN
> (Controller area Network) Core... I got the core from an open source..
> and I get an error as below:
> 
> FATAL_ERROR:Xst:Portability/export/Port_Main.h:126:1.13 - This
> the support page doesn't give me any help as they haven't covered this
> error yet... any ideas?

This happens pretty often. It is similar to a Windows GPF: if something
is internally wrong with XST, it fails with that Port_Main.h:126 ...

I hope it will be better with the next release of ISE; so far no cure is known.

TRY - clean the project, close ISE, start over and try again. Sometimes the
problem goes away by itself; if that doesn't help try to 

OOPS, this time it DOES NOT HELP :(

Both the VHDL and Verilog versions fail.
I guess the problem is bad Verilog/VHDL support for array types,
so the problem could be solved by rewriting
can_fifo.v or can_fifo.vhd.

Note that the fatal error occurs on the can_bsp module, not the can_fifo module.

Try: replace can_fifo with a known-good dummy module and try to synthesize.
If that succeeds, then seek, find and correct the XST problem, and PLEASE
let us know if you succeed in finding and/or correcting the problem!


 
> can anyone suggest another freely available CAN core?

Try both the VHDL and Verilog versions; maybe one of them goes OK.
See above, not this time :(

All the others are commercial $$$

Article: 59999
Subject: Re: How to extend a pulse width without clock!
From: antti@case2000.com (Antti Lukats)
Date: 3 Sep 2003 06:47:13 -0700
Links: << >> << T >> << A >>

> "peterzhu" <peter.zhu@utstar.com> wrote in message
> news:61c1427f.0309030030.57cc99c4@posting.google.com...
> > Due to a chip bug, I have to extend a pulse width (negative) from 10ns
> > to 100ms in CPLD (Altera 7128). But the difficulty is that I have no
> > clock into the CPLD, so the CPLD is pure combinational logic. How to
> > extend it in such a case?

Oops, bad luck. It is not recommended, but if you have enough free pins and
logic and if the timing is not critical, it is possible to make a free-running
oscillator without external RC components: just connect an odd number of
inverters in a ring (e.g. 3 inverters). As this is astable it will oscillate
at a pretty high frequency. This could be divided down, but from about
40MHz down to 100ms it's a pretty long counter ... and this approach
really isn't 'recommended'.

As another option, build a simple RC on the IO cells and use that signal.

antti
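
To put a number on "pretty long counter", here is a back-of-the-envelope sketch. The 40 MHz figure is the rough estimate from the post above; the 60 MHz value in the second part is an arbitrary assumption used only to show how the pulse width moves when the uncontrolled ring oscillator runs faster than expected. The function names are made up for this example.

```python
def divider_for(freq_hz, duration_s):
    """Return (counts, bits) for a counter that spans duration_s at freq_hz."""
    counts = round(freq_hz * duration_s)
    bits = max(counts - 1, 0).bit_length()
    return counts, bits


def actual_duration(counts, freq_hz):
    """Pulse width produced by a fixed count if the oscillator runs at freq_hz."""
    return counts / freq_hz


counts, bits = divider_for(40e6, 0.100)
print(f"{counts} counts, {bits}-bit counter")

# Same counter, but the ring oscillator happens to run at an assumed 60 MHz
print(f"{actual_duration(counts, 60e6) * 1e3:.1f} ms")
```

Every counter bit costs a flip-flop in the CPLD, and the resulting pulse width drifts with whatever the ring oscillator frequency turns out to be on a given part, voltage and temperature.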
